bigdatasig
LOGS
14:00:17 <tstclair> #startmeeting
14:00:17 <zodbot> Meeting started Thu May  8 14:00:17 2014 UTC.  The chair is tstclair. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:17 <zodbot> Useful Commands: #action #agreed #halp #info #idea #link #topic.
14:00:31 <tstclair> #meetingname BIGDATASIG
14:00:31 <zodbot> The meeting name has been set to 'bigdatasig'
14:00:41 <tstclair> #meetingtopic status
14:01:10 <tstclair> Morning folks, want to do a quick status update?
14:01:55 <tstclair> mattf, rsquared, willb__, pmackinn, et al.
14:02:00 * pmackinn ambari 1.5.1 in f20, rawhide; number80 working on my sqoop review
14:02:09 * willb__ is hoping to have draft Scala packaging guidelines by the next time we meet
14:02:47 <tstclair> willb, nice.
14:02:49 <rsquared> Hadoop 2.4.0 is getting close.  2 obstacles remain: Web portals (closing in on a fix), and leveldbjni-all is broken.  The latter affects the new timeline server.  I might be able to get around it, but I won't hold things up.
14:03:24 <tstclair> Updating glusterfs-hadoop & tachyon blocked on Hadoop 2.4 transition, no hurry.
14:04:47 <tstclair> Hoping to clean up and get the rest of my patches for Mesos upstream so we pull direct from upstream for the 0.19.0 release.
14:05:06 <willb> awesome
14:06:29 <tstclair> Anything else noteworthy that we should be tracking?
14:06:30 <mattf> i'm going to be in atlanta for openstack summit next week, talking a lot about big data & ...openstack
14:07:32 <mattf> christopher tubbs is packaging accumulo for f21 and has added it to the sig packaging page
14:08:54 <mattf> also, i'll toot his horn for him, willb will be speaking at spark summit about his cycling app that uses fedora & spark
14:09:18 <willb> thanks mattf
14:09:24 <mattf> s'all i got atm
14:09:36 <tstclair> congrats again willb!
14:10:20 <tstclair> willb, pmackinn, rsquared, mattf - Any other group topics ?
14:10:31 <rsquared> Nothing from me
14:10:35 <pmackinn> nope
14:11:05 <willb> as usual, anyone looking for stuff to package should check out the wishlist on the Scala page :-)
14:11:18 <mattf> the topic of epel and centos seems to come up from time to time
14:11:25 <tstclair> willb, is there a link on: https://fedoraproject.org/wiki/SIGs/bigdata/packaging ?
14:11:56 <willb> tstclair, yes, but the list is at the bottom of this page:  https://fedoraproject.org/wiki/SIGs/bigdata/packaging/Scala
14:12:23 <mattf> since epel7 ~= f18, it seems like a huge hurdle to try and do packages for
14:12:50 * willb notes agreement
14:12:59 <mattf> centos may be an interesting target in the future, since the restriction on overlapping packages is not present there
14:13:40 <tstclair> mattf, what channels?  epel or centos?
14:13:43 <pmackinn> willb, any interest in play?
14:13:45 <mattf> in fact, some folks have tools & experience doing dep graph rebuilds on el, in such a way that ignores conflicting packages
14:14:06 <mattf> tstclair, hurdle for epel seems higher than centos
14:14:23 <willb> pmackinn, yes
14:14:45 <mattf> it might even be possible to do a big data copr build that sits on centos, which would be in line w/ their model. such an option isn't available for epel, afaik.
14:15:02 * tstclair thinks CentOS traction would gain a much higher audience.
14:15:06 <willb> pmackinn, I don't have strong preferences (at a tech level) between Spray, Lift, and Play, but it would be great to have all three
14:15:11 <misc> what you would want is a centos SIG
14:15:35 <misc> http://wiki.centos.org/SpecialInterestGroup
14:15:45 <pmackinn> willb, ack, see lots of buzziness around play
14:15:49 <mattf> misc, and all we wanted before was a fedora sig (box checked)
14:16:20 <mattf> tstclair, i agree the audience may be broader for a centos sig
14:16:48 <mattf> however, there's no big data ball rolling toward centos atm, just occasional whispers of it or epel
14:16:59 <tstclair> misc, thanks BTW.
14:17:01 <mattf> misc, do you know much about the centos process / ecosystem?
14:17:18 <misc> mattf: I know a bit
14:17:38 <misc> for now, the SIGs in centos are a bit different than SIGs in Fedora, despite having the same name
14:18:06 <mattf> how are they different?
14:18:20 <misc> anyone can start a SIG for Fedora, and for Centos, there is a process to follow
14:18:31 <misc> and requirements
14:18:37 <misc> this is a bit more codified
14:18:56 <misc> but the goal is indeed to have a layer on top of centos to explore a specific domain
14:18:58 <mattf> it's true we are kinda an ad hoc, rag tag bunch in fedora sometimes
14:19:10 <misc> ie, voip, virtualisation, storage
14:20:07 <misc> mattf: I think the easiest would be to contact someone from centos, like karanbir singh or karsten wade
14:20:44 <mattf> gotcha. we should probably also have at least half a dozen people interested in such a thing.
14:21:12 <mattf> there's probably a good way to get data on interest. right now it's all ad hoc.
14:21:17 <tstclair> Wouldn't there be contention of epel vs. CentOS?
14:21:36 <mattf> i dunno, what are you imagining?
14:21:50 <misc> conflicting packages would only be an issue if you decide to have different repos
14:22:13 <tstclair> epel broader audience.
14:22:23 <tstclair> If the goal is exposure.
14:22:46 * tstclair tries to figure out what hurdle is > ?
14:22:48 <mattf> tstclair, i wonder if there's data on that
14:23:06 <mattf> the centos folks surely have data on their audience, i wonder if epel does too
14:23:16 <tstclair> epel - RHEL, CentOS, Scientific ...
14:23:46 <mattf> epel can surely reach a broader audience, but does it?
14:23:52 <misc> well, centos is just a rebuild of RHEL
14:23:59 <misc> and you kinda use epel anyway
14:24:15 <misc> and you could have a centos sig that produces rpms for epel, there is no delimitation :)
14:24:19 <tstclair> I don't know anyone who uses native EL/CentOS w/o epel.
14:24:55 <mattf> i wonder if karanbir or karsten could tell us w/ data
14:25:12 <mattf> i'll email them and ask
14:26:46 <misc> what kind of data are you looking for ?
14:27:17 <mattf> well, ideally # people using
14:27:26 <mattf> which i'd argue is different from # installs or # downloads
14:27:35 <mattf> and # running systems
14:27:56 <misc> it's hard to get a precise number, but I think "a lot" would be a good answer :)
14:27:58 <mattf> it's tricky tho, since 1 person may manage 1000 machines that are used by 10000 people
14:28:43 <mattf> misc, tstclair, how would you guys phrase the question to get a read on epel v centos sig?
14:29:38 <misc> I think the question is "do we want to push the rpm to epel, or do we want to have a separate repository"
14:30:46 <misc> a centos SIG would be a place for centos users of what the current SIG produces
14:30:47 <mattf> hmm, because essentially a centos sig is like a copr repo on fedora -- it's a separate repo w/ a sig to maintain it?
14:31:07 <misc> a SIG is just a group
14:31:24 <misc> then they have access to some resources, a ml, webhosting, etc, etc
14:31:29 <tstclair> I'm not sold on a CentOS sig, I think it's limiting.
14:32:10 <misc> now, there is indeed an overlap between what fedora offers and what centos offers in terms of resources, reach
14:32:41 <misc> but it doesn't hurt to ask and discuss
14:32:57 * tstclair agrees
14:33:33 <misc> the way i see it, the goal is to create a community around big data, with packaging, users helping, documentation, etc
14:33:50 <misc> the fedora sig does that with fedora as a base, + push to epel,
14:33:55 <mattf> misc, do i have the model approximately right: a centos sig maintains a repo that works on centos which is analogous to a copr repo for fedora?
14:33:56 <misc> centos does that with centos as a base
14:34:06 <misc> mattf: it could be, yes
14:34:18 <misc> not all SIGs are around packages, but those that are would go that way
14:34:41 <misc> for example, xen4centos is maintaining xen
14:34:46 <mattf> misc, as for the fedora big data sig, we've done a good job re packaging, not so much user help. documentation somewhat through blogs.
14:36:16 <misc> mattf: well, big data is still a bit new, and you need to have people that need it, and packages, before having people who know how it works and write docs :)
14:36:35 <mattf> in my tech mind the epel v centos tradeoff is mostly around "can you satisfy the epel requirements" if that bar is too high you may want to go to centos
14:37:04 <misc> yeah, that too
14:37:39 <tstclair> Should we make that an action item to follow up on?
14:37:52 <mattf> epel requirements being things like "don't duplicate packages in el", which is a very high bar (O(n^2), n = number of deps, heh)
14:38:34 <mattf> tstclair, i'd like to hear from the epel and centos folks about when to pick one or the other, and get an idea of their reach.
14:39:06 <pmackinn> copr makes the most sense, regardless of epel v. centos
14:40:47 <mattf> if that's the case, what's to stop us from doing an el copr build of packages w/ centos marketing or epel reach?
14:41:03 <misc> nothing
14:41:05 <tstclair> best of both worlds?
14:41:34 <mattf> the whole copr approach is pretty awesome, it lets you make forward progress and mark the places where there are conflicts w/ fedora/epel/centos
14:42:29 <pmackinn> copr can be affiliated to centos sig, but not its raison d'etre...seems like
14:42:35 <mattf> i for one would like to ask the centos folks about reach and tradeoffs w/ epel. does anyone have any copr experience here?
14:43:03 * tstclair nopes
14:44:46 * pmackinn fears reach data somehow informs what should be *optimal* packaging approach...shouldn't imho
14:46:02 <mattf> pmackinn, fair, esp since reach != use
14:46:25 <tstclair> I think at this point there needs to be follow up, should we table for now and leave as an action item?
14:46:52 <mattf> let's. i'll take the action to check w/ the centos folks.
14:47:08 <tstclair> Anyone want to dig into copr?
14:47:54 <tstclair> pmackinn, rsquared, eje ^?
14:48:31 <pmackinn> tstclair, i can read up on it
14:48:45 <tstclair> ack, thanks pmackinn!
14:49:00 <tstclair> Are there other items for the group?
14:49:27 <rsquared> Nothing from me
14:49:27 <mattf> none from me
14:50:37 <tstclair> ok, I'll call it.  Thanks everyone!
14:50:47 <tstclair> #endmeeting