14:00:17 <tstclair> #startmeeting
14:00:17 <zodbot> Meeting started Thu May 8 14:00:17 2014 UTC. The chair is tstclair. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:17 <zodbot> Useful Commands: #action #agreed #halp #info #idea #link #topic.
14:00:31 <tstclair> #meetingname BIGDATASIG
14:00:31 <zodbot> The meeting name has been set to 'bigdatasig'
14:00:41 <tstclair> #meetingtopic status
14:01:10 <tstclair> Morning folks, want to do a quick status update?
14:01:55 <tstclair> mattf, rsquared, willb__, pmackinn, et. all
14:02:00 * pmackinn ambari 1.5.1 in f20, rawhide; number80 working on my sqoop review
14:02:09 * willb__ is hoping to have draft Scala packaging guidelines by the next time we meet
14:02:47 <tstclair> willb, nice.
14:02:49 <rsquared> Hadoop 2.4.0 is getting close. 2 obstacles remain: Web portals (closing in on a fix), and leveldbjni-all is broken. The later affects the new timeline server. I might be able to get around it, but I won't hold things up.
14:03:24 <tstclair> Updating glusterfs-hadoop & tachyon blocked on Hadoop 2.4 transition, no hurry.
14:04:47 <tstclair> Hoping to cleanup and get the rest of my patches for Mesos upstream so we pull direct from upstream for 0.19.0 release.
14:05:06 <willb> awesome
14:06:29 <tstclair> Anything else noteworthy that we should be tracking?
14:06:30 <mattf> i'm going to be in atlanta for openstack summit next week, talking a lot about big data & ...openstack
14:07:32 <mattf> christopher tubbs is packaging accumulo for f21 and has added it to the sig packaging page
14:08:54 <mattf> also, i'll toot his horn for him, willb will be speaking at spark summit about his cycling app that uses fedora & spark
14:09:18 <willb> thanks mattf
14:09:24 <mattf> s'all i got atm
14:09:36 <tstclair> congrats again willb!
14:10:20 <tstclair> willb, pmackinn, rsquared, mattf - Any other group topics ?
14:10:31 <rsquared> Nothing from me
14:10:35 <pmackinn> nope
14:11:05 <willb> as usual, anyone looking for stuff to package should check out the wishlist on the Scala page :-)
14:11:18 <mattf> the topic of epel and centos seems to come up from time to time
14:11:25 <tstclair> willb, is there a link on: https://fedoraproject.org/wiki/SIGs/bigdata/packaging ?
14:11:56 <willb> tstclair, yes, but the list is at the bottom of this page: https://fedoraproject.org/wiki/SIGs/bigdata/packaging/Scala
14:12:23 <mattf> since epel7 ~= f18, it seems like a huge hurdle to try and do packages for
14:12:50 * willb notes agreement
14:12:59 <mattf> centos may be an interesting target in the future, since there restrictions for overlapping packages is not present
14:13:40 <tstclair> mattf, what channels? epel or centos?
14:13:43 <pmackinn> willb, any interest in play?
14:13:45 <mattf> in fact, some folks have tools & experience doing dep graph rebuilds on el, in such a way that ignores conflicting packages
14:14:06 <mattf> tstclair, hurdle for epel seems higher than centos
14:14:23 <willb> pmackinn, yes
14:14:45 <mattf> it might even be possible to do a big data copr build that sits on centos, which would be inline w/ their model. such an option isn't available for epel, afaik.
14:15:02 * tstclair thinks CentOS traction would gain a much higher audience.
14:15:06 <willb> pmackinn, I don't have strong preferences (at a tech level) between Spray, Lift, and Play, but it would be great to have all three
14:15:11 <misc> what you would want is a centos SIG
14:15:35 <misc> http://wiki.centos.org/SpecialInterestGroup
14:15:45 <pmackinn> willb, ack, see lots of buzziness around play
14:15:49 <mattf> misc, and all we wanted before was a fedora sig (box checked)
14:16:20 <mattf> tstclair, i agree the audience may be broader for a centos sig
14:16:48 <mattf> however, there's no big data ball rolling toward centos atm, just occasional whispers of it or epel
14:16:59 <tstclair> misc, thanks BTW.
14:17:01 <mattf> misc, do you know much about the centos process / ecosystem?
14:17:18 <misc> mattf: I know a bit
14:17:38 <misc> for now, the SIG in centos are a bit different than SIG in Fedora, despites having the same name
14:18:06 <mattf> how are they different?
14:18:20 <misc> anyone can start a SIG for Fedora, and for Centos, there is a process to follow
14:18:31 <misc> and requirement
14:18:37 <misc> this is a bit more codified
14:18:56 <misc> but the goal is indeed to have a layer on top of centos to explore a specific domain
14:18:58 <mattf> it's true we are kinda an ad hoc, rag tag bunch in fedora sometimes
14:19:10 <misc> ie, voip, virtualisation, storage
14:20:07 <misc> mattf: I think the easiest would be to contact someone from centos, like karanbir singh or karsten wade
14:20:44 <mattf> gotcha. we should probably also have at least half a dozen people interested in such a thing.
14:21:12 <mattf> there's probably a good way to get data on interest. right now it's all ad hoc.
14:21:17 <tstclair> Wouldn't there be contention of epel vs. CentOS?
14:21:36 <mattf> i dunno, what are you imagining?
14:21:50 <misc> conflicting package would only be a issue if you decide to have differents repo
14:22:13 <tstclair> epel broader audience.
14:22:23 <tstclair> If the goal is exposure.
14:22:46 * tstclair tries to figure out what hurdle is > ?
14:22:48 <mattf> tstclair, i wonder if there's data on that
14:23:06 <mattf> the centos folks surely have data on their audience, i wonder if epel does too
14:23:16 <tstclair> epel - RHEL, CentOS, Scientific ...
14:23:46 <mattf> epel can surely reach a broader audience, but does it?
14:23:52 <misc> well, centos is just a rebuild of RHEL
14:23:59 <misc> and you kinda use epel anyway
14:24:15 <misc> and you could have a centos sig that produce rpm for epel, there is not delimitation :)
14:24:19 <tstclair> I don't know anyone who uses native EL/CentOS w/o epel.
14:24:55 <mattf> i wonder if karanbir or karsten could tell us w/ data
14:25:12 <mattf> i'll email them and ask
14:26:46 <misc> what kind of data are you looking for ?
14:27:17 <mattf> well, ideally # people using
14:27:26 <mattf> which i'd argue is different from # installs or # downloads
14:27:35 <mattf> and # running systems
14:27:56 <misc> that's hard to get for precise number, but I think "a lot" would be a good answer :)
14:27:58 <mattf> it's tricky tho, since 1 person may manage 1000 machines that are used by 10000 people
14:28:43 <mattf> misc, tstclair, how would you guys phrase the question to get a read on epel v centos sig?
14:29:38 <misc> I think the question is "do we want to push the rpm to epel, or do we want to have a separate repository"
14:30:46 <misc> a centos SIG would be a place for centos users of what the current SIG produce
14:30:47 <mattf> hmm, because essentially a centos sig is like a copr repo on fedora -- it's a separate repo w/ a sig to maintain it?
14:31:07 <misc> a SIG is just a group
14:31:24 <misc> then they have access to some ressources, a ml, webhosting, etc, etc
14:31:29 <tstclair> I'm not sold on a CentOS sig, I think it's limiting.
14:32:10 <misc> now, there is indeed a overlap with what fedora offer and what centos offer in term of ressources, reach
14:32:41 <misc> but it doesn't hurt to ask and discuss
14:32:57 * tstclair agrees
14:33:33 <misc> the way i see, the goal is to create a community around big data, with packaging, users helping, documentation, etc
14:33:50 <misc> fedora sig do that with fedora as a base, + push to epel,
14:33:55 <mattf> misc, do i have the model approximately right: a centos sig maintains a repo that works on centos which is analogous to a copr repo for fedora?
14:33:56 <misc> centos do that with centos as a base
14:34:06 <misc> mattf: it could be, yes
14:34:18 <misc> not all sIG are around package, but those that are would go that way
14:34:41 <misc> for example, xen4centos is maintaining xen
14:34:46 <mattf> misc, as for the fedora big data sig, we've done a good job re packaging, not so much user help. documentation somewhat through blogs.
14:36:16 <misc> mattf: well, big data is still a bit new, and you need to have people that need it, and packages before having people who know how it work and write doc :)
14:36:35 <mattf> in my tech mind the epel v centos tradeoff is mostly around "can you satisfy the epel requirements" if that bar is too high you may want to go to centos
14:37:04 <misc> yeah, taht too
14:37:39 <tstclair> Should we make that an action item to follow up on?
14:37:52 <mattf> epel requirements being things like "don't duplicate packages in el", which is a very high bar (O(n^2), n = number of deps, heh)
14:38:34 <mattf> tstclair, i'd like to hear from the epel and centos folks about when to pick one or the other, and get an idea of their reach.
14:39:06 <pmackinn> copr makes the most sense, regardless of epel v. centos
14:40:47 <mattf> if that's the case, what's to stop us from doing an el copr build of packages w/ centos marketing or epel reach?
14:41:03 <misc> nothing
14:41:05 <tstclair> best of both worlds?
14:41:34 <mattf> the whole copr approach is pretty awesome, it lets your make forward progress and mark the places where there are conflicts w/ fedora/epel/centos
14:42:29 <pmackinn> copr can be affiliated to centos sig, but not its raison d'etre...seems like
14:42:35 <mattf> i for one would like to ask the centos folks about reach and tradeoffs w/ epel. does anyone have any copr experience here?
14:43:03 * tstclair nopes
14:44:46 * pmackinn fears reach data somehow informs what should be *optimal* packaging approach...shouldn't imho
14:46:02 <mattf> pmackinn, fair, esp since reach != use
14:46:25 <tstclair> I think at this point there needs to be follow up, should we table for now and leave as an action item?
14:46:52 <mattf> let's. i'll take the action to check w/ the centos folks.
14:47:08 <tstclair> Anyone want to dig into copr?
14:47:54 <tstclair> pmackinn, rsquared, eje ^?
14:48:31 <pmackinn> tstclair, i can read up on it
14:48:45 <tstclair> ack, thanks pmackinn!
14:49:00 <tstclair> Are there other items for the group?
14:49:27 <rsquared> Nothing from me
14:49:27 <mattf> none from me
14:50:37 <tstclair> ok, I'll call it. Thanks everyone!
14:50:47 <tstclair> #endmeeting