big_data_sig
LOGS
15:01:34 <tstclair> #startmeeting
15:01:34 <zodbot> Meeting started Thu Dec 19 15:01:34 2013 UTC.  The chair is tstclair. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:01:34 <zodbot> Useful Commands: #action #agreed #halp #info #idea #link #topic.
15:01:35 <mizdebsk> hello
15:01:45 <tstclair> #meetingname BIG_DATA_SIG
15:01:45 <zodbot> The meeting name has been set to 'big_data_sig'
15:02:00 <mattf> howdy
15:02:08 <tstclair> #meetingtopic status_update
15:02:26 <tstclair> #topic collection_agenda
15:02:27 <mattf> fedora 20 released w/ hadoop !
15:03:12 <tstclair> rsquared, I have 1 item for the agenda - Apparently someone from the field is having issues.
15:03:27 <mattf> tstclair, is that topic and meetingtopic right?
15:03:32 <rsquared> tstclair: Regarding the datanode?
15:04:01 <tstclair> mattf, - collecting agenda items atm.
15:04:21 <tstclair> feel free to chime in and we'll roll through the topics.
15:04:42 <tstclair> Agenda Item 2 - Review status.
15:04:55 <tstclair> Agenda Item 3 - Issues
15:05:15 <tstclair> Anything else folks want to discuss.
15:05:18 <tstclair> rsquared, yup
15:06:10 <mattf> willb and i went to the env & stack wg meeting this week, we could give a bit of an update
15:07:22 <tstclair> Kyle, let's roll through them and we can always add on.
15:07:30 <tstclair> tab complete..
15:07:44 <tstclair> Kyle -> k
15:08:02 <tstclair> #topic datanode_fail
15:08:26 <mattf> i wasn't able to reproduce it. i used a docker managed container instead of a full system though
15:08:28 <tstclair> rsquared, any insight here.  I've recently updated and been running for some time.
15:09:15 <rsquared> tstclair: I'd need to see the log.  I'd guess he didn't start the namenode after formatting it and before starting the datanode, but without more details I can't say much.  I was waiting for him to respond to matt's request for log info
15:10:06 <tstclair> agreed.
15:10:22 <tstclair> rsquared, he also made the off-the-cuff comment re:dep chain.
15:11:01 <tstclair> have we compared full dep-chain vs other distros?
15:11:48 <rsquared> I wonder if he formatted the namenode but never ran hdfs-create-dirs
15:12:29 <rsquared> Not sure the dep chain comparison is all that relevant.  Fedora may package differently, and pulling in multiple hundreds of packages for a single entity isn't uncommon.
15:12:35 <rsquared> Even outside of java
15:12:36 <mattf> afaik hdfs-create-dirs wouldn't make a difference until he tries to run a job
15:12:53 <willb> Seems like a lot of things need to happen in order on initial setup; could we provide a script?
15:13:43 <tstclair> rsquared, ^
15:14:00 <rsquared> willb: We have one.   hdfs-create-dirs
15:14:08 <rsquared> It will format and set up the namenode.
15:14:20 <rsquared> Then you just need to start the services as you require
15:14:46 <mattf> ahh, kinda hdfs-setup, because it does the initial service start
15:15:00 <mattf> all other times you have to either set the service to start on boot or start it manually
15:16:08 <tstclair> 1-time pad.
15:16:25 <rsquared> What hdfs-create-dirs does is format the namenode, then create the dirs that are needed in hdfs.  It will start the namenode if needed or leave it running if it was found running.  I'm open to a different name if that'll help clear things up.
15:17:07 <rsquared> If hdfs-create-dirs starts the namenode to do the dir creation, it will shut it down afterwards.
15:17:14 <mattf> you know what would be nice, if it was integrated into the hdfs command
15:17:57 <mattf> hdfs namenode -format && hdfs namenode -populate-fs, or somesuch
15:18:26 <willb> philosophical point:  there probably shouldn't be n different ways to set this up without one obviously correct one
15:19:02 <mattf> +1 ^^
15:19:32 <mattf> when we looked at this before, all the distros seemed to just document a handful of steps that rsquared automated into hdfs-create-dirs
15:20:34 <rsquared> bigtop was moving to providing a script for setup as well.  It's probably in the 0.7.0 release.
15:21:13 <tstclair> I think at this point we're kind of splitting hairs, as none of us have repro'd and many are running.
15:21:15 <mattf> 0.7.0 was released, iirc
15:21:23 <mattf> tstclair, aye
15:21:42 <tstclair> #topic status
15:21:44 <rsquared> Yep.  I've looked at 0.7.0 but haven't verified the script is in their release.
15:22:38 <mattf> status - congrats on getting hadoop into F20, released this week!
15:23:23 <tstclair> update: I've been working on a blog post to outline MR w/tachyon, in that process I found several oddities that I have not seen before b/c my original smoke testing wasn't very rough.  In digging deeper I've started filing issues upstream.
15:24:33 <tstclair> I'll likely push out updates to package and blog once everything is working.
15:24:44 <tstclair> as expected.
15:24:52 <rsquared> status - Fought battles with rawhide and hopefully in the final few tests before submitting hbase for review
15:25:02 <willb> update: since last time, I got my patch for sbt upstream (to support Ivy 2.3.0) and have been grinding through my review and packaging backlog
15:25:08 <mattf> i tried out hadoop in a docker container and failed - the current docker containers don't support "system" containers, one where systemd is run, so there's no systemctl start ..., you have to manually start the services
15:25:40 <tstclair> mattf, I'd expect failure honestly.
15:25:43 <mattf> i also spun some f20 kvm instances w/ virt-install, that worked pretty well except i didn't give them enough memory and a basic pi job nearly took down my laptop
15:26:17 <pmackinn> status - 1) datanucleus-{core,api-jdo,rdbms} in rawhide, creeping into f20 2) hive spec development based on latest 0.12.0 3) toyed with mizdebsk ivy xmvn resolver from bz 1012612; ran into CL issues, will have another go at some point
15:26:17 <tstclair> Has anyone thought about the systemd <> Docker issue?
15:26:54 <mattf> supposedly systemd w/i a docker container can work w/ the new libvirt-lxc backend to docker, but it's not upstream yet
15:27:13 <tstclair> good-to-know
15:28:29 <mizdebsk> latest xmvn snapshots (not yet in rawhide) have initial integration with ivy; resolution of system artifacts from ant with ivy tasks or from sbt works
15:28:37 <tstclair> status update 2: working on patches to update the mesos build and enable c++11.  Conversed with folks around ecosystem, features, etc.
15:28:47 <mizdebsk> if i get some confirmation that it works for you too i'll release it in rawhide and maybe f20
15:29:12 <pmackinn> mizdebsk, got a test suite for that? didn't see one in gh.c
15:29:42 <willb> mizdebsk, it's on my list to check out as soon as I can, but I'd love to see it in F20 as well as rawhide
15:29:57 <mizdebsk> pmackinn: not committed yet
15:30:07 <pmackinn> mizdebsk, ack
15:31:13 <mizdebsk> willb: i checked it with bootstrapped sbt in rawhide and sbt was able to resolve stuff from xmvn
15:31:40 <willb> mizdebsk, that is great!
15:31:50 <mizdebsk> (i don't know how to ass extra libs to sbt classpath, so i put them in scala/lib dir)
15:31:59 <mizdebsk> s/ass/add
15:32:52 <willb> I'll look at integration
15:33:02 <willb> another update:  mattf and I attended the Env and Stacks WG meeting and raised some of the concerns that we've seen around SIG packaging efforts:  poor integration of non-"standard" language-specific dependency managers into Fedora, upstreams maintaining compatibility with older JVMs (and thus being tied to older libraries than are in Fedora), etc.  They are going to be looking at our SIG's use cases as they flesh out a plan
15:33:02 <willb> for better integrating new language ecosystems in Fedora.
15:34:59 <tstclair> #topic issues
15:35:06 <tstclair> anything blocking?
15:36:55 <tstclair> If nothing, then we can end a bit early today?
15:38:45 <tstclair> Kyle, I think we'll call it then.  Thanks everyone!  Happy Holidays, & Happy New Year!
15:39:21 <pmackinn> Happy Holidays Kyle!
15:39:29 <tstclair> #endmeeting