fedora-meeting
LOGS
20:00:19 <mmcgrath> #startmeeting Infrastructure
20:00:20 <zodbot> Meeting started Thu Mar 25 20:00:19 2010 UTC.  The chair is mmcgrath. Information about MeetBot at http://wiki.debian.org/MeetBot.
20:00:21 <mmcgrath> Who's here?
20:00:22 * skvidal is here
20:00:22 <zodbot> Useful Commands: #action #agreed #halp #info #idea #link #topic.
20:00:26 * lmacken 
20:00:45 * a-k is here
20:02:19 <mmcgrath> #topic Fedora 13 beta
20:02:21 <mmcgrath> hah
20:02:28 <mmcgrath> Let's get started
20:02:38 <mmcgrath> https://fedorahosted.org/fedora-infrastructure/report/9
20:03:03 <mmcgrath> .ticket 2058
20:03:04 <zodbot> mmcgrath: #2058 (Verify Mirror Space) - Fedora Infrastructure - Trac - https://fedorahosted.org/fedora-infrastructure/ticket/2058
20:03:05 <mmcgrath> I'll get this one
20:03:20 <mmcgrath> .ticket 2059
20:03:24 <zodbot> mmcgrath: #2059 (Release Day ticket) - Fedora Infrastructure - Trac - https://fedorahosted.org/fedora-infrastructure/ticket/2059
20:03:27 <mmcgrath> This one's just a tracker ticket.  I'll take it too
20:03:38 <mmcgrath> .ticket 2060
20:03:40 <zodbot> mmcgrath: #2060 (Verify releng permissions) - Fedora Infrastructure - Trac - https://fedorahosted.org/fedora-infrastructure/ticket/2060
20:03:42 <mmcgrath> smooge: want to get that one again?
20:04:52 <mmcgrath> we'll come back to that.
20:04:57 <mmcgrath> .tiny 2061
20:04:58 <zodbot> mmcgrath: Error: '2061' is not a valid url.
20:05:04 <mmcgrath> .ticket 2061
20:05:06 <mmcgrath> sorry :/
20:05:07 <zodbot> mmcgrath: #2061 (MM redirects) - Fedora Infrastructure - Trac - https://fedorahosted.org/fedora-infrastructure/ticket/2061
20:05:17 <mmcgrath> mdomsch usually gets this one (I believe it's automated now and just requires verification)
20:05:20 <mmcgrath> .ticket 2062
20:05:21 <zodbot> mmcgrath: #2062 (Infrastructure Change Freeze) - Fedora Infrastructure - Trac - https://fedorahosted.org/fedora-infrastructure/ticket/2062
20:05:24 <mmcgrath> I'll get that, we are frozen.
20:05:31 <smooge> morning
20:05:39 <smooge> sorry got stuck on phone
20:05:43 <mmcgrath> smooge: hey, want to get 2060?
20:05:49 <smooge> yes
20:05:55 <mmcgrath> k
20:05:57 <mmcgrath> and the last ticket
20:06:05 <mmcgrath> .ticket 2063 doesn't need to be done until after the launch
20:06:07 <smooge> could I get 2058 .. I find them related
20:06:08 <zodbot> mmcgrath: Error: "2063 doesn't need to be done until after the launch" is not a valid integer.
20:06:27 <smooge> hehheeh
20:06:43 <mmcgrath> zodbot: you are testing me!
20:06:51 <mmcgrath> ok, anyone have any questions or comments related to the release?
20:06:53 <mmcgrath> Oxf13: you around?
20:07:03 <Oxf13> I am
20:07:11 <mmcgrath> what are our odds of slipping at the moment?
20:07:24 <Oxf13> I'd put it at 50% chance
20:07:41 <Oxf13> There is one blocker we're worried about, but we have a patch in hand, it just needs testing, then I can make the RC
20:07:44 <skvidal> mmcgrath: so - when it's a 50% chance of rain - you carry an umbrella :)
20:07:48 <mmcgrath> cool.
20:07:50 <Oxf13> we have a compressed amount of time to test the RC
20:08:07 <Oxf13> and not really any time to fix anything that's wrong with the RC and validate a second RC before the go / no go time
20:08:09 <mmcgrath> Oxf13: I'll work with you later today or tomorrow to verify mirror space, are we expecting this to be the same size(ish) as the alpha?
20:08:14 <Oxf13> yes
20:08:36 <mmcgrath> k, sounds good.
20:08:43 <mmcgrath> If no one has anything else, I'll move on?
20:08:54 * smooge remembers the days of having 8 or 9 RC's
20:09:07 <smooge> nothing else
20:09:14 <mmcgrath> #topic func updates
20:09:30 <mmcgrath> So after some coding and some testing, the func updates before the freeze went pretty well I thought.
20:09:45 <skvidal> does anyone else want to work on that project?
20:09:47 <mmcgrath> still a few kinks to work out but it was much easier than our current method and required much less attention.
20:09:57 <mmcgrath> skvidal: as in you're done with it or want some help?
20:10:03 <skvidal> want some help
20:10:21 <skvidal> I got asked to work on something else this week
20:10:22 <mmcgrath> skvidal: I have some cycles during the freeze.  though I can't promise I won't make things worse :)
20:10:30 <skvidal> and that's been taking my focus
20:11:07 <skvidal> so it's not dropped
20:11:14 <mmcgrath> skvidal: well I'm sure I'll be pinging you soon(ish)
20:11:17 <skvidal> but I won't be able to spend as much time on it until I get the mock vm stuff out
20:11:20 <mmcgrath> <nod>
20:11:25 <mmcgrath> anyone else have questions or comments on that?
20:11:41 <skvidal> the func+yum thing is lightweight
20:11:46 <skvidal> and entry-level easy to work on
20:11:58 <skvidal> http://fedorapeople.org/gitweb/skvidal/func-yum.git
20:12:02 <skvidal> lots of easy wins
20:12:12 <mmcgrath> skvidal: thanks
20:12:19 <mmcgrath> Ok, next topic
20:12:21 <mmcgrath> #topic Collectd
20:12:31 <mmcgrath> So we've been using collectd for a bit now.
20:12:35 <mmcgrath> what do people think?
20:12:39 <abadger1999> skvidal: Thanks for getting func working well again.
20:12:51 <skvidal> abadger1999: so much to do to make things 'well'
20:12:52 <smooge> skvidal, I would like to help
20:12:53 <skvidal> but it is unbroken
20:13:01 <abadger1999> :-)
20:13:02 <smooge> my python is broken, but I really want to help
20:13:26 <smooge> sorry meant help on func
20:13:33 <smooge> collectd I have found useful
20:13:46 <abadger1999> mmcgrath: It's helped us fix something already.  It's generally good.
20:13:51 <smooge> looking at app04 I can see where it is heavily running into some issues
20:13:52 <mmcgrath> smooge: me too, we've already found several problems just by having it in place.
20:14:05 <skvidal> what did collectd help you fix?
20:14:26 <mmcgrath> skvidal: it helped us find the outage blips as being related to db2.
20:14:31 <skvidal> ah
20:14:32 <skvidal> cool
20:15:02 <mmcgrath> other tools could have found it, but just the way we have it set up right now (every 10s) allowed us to see that the load spike in such a short window was related to disk, and even more than that, disk writes.
20:15:06 <mmcgrath> which got us looking.
20:15:22 <mmcgrath> and while it's not totally fixed I do think we're in better shape.  I think we just need to adjust our backup system a bit.
20:15:31 <mmcgrath> but that'll be a post-freeze thing.
20:15:35 <mmcgrath> so here's the only gotcha with collectd.
20:15:45 <skvidal> it's a massive suck?
20:15:52 <mmcgrath> https://admin.fedoraproject.org/collectd/bin/index.cgi?hostname=log01&plugin=disk&timespan=604800&action=show_selection&ok_button=OK
20:15:57 <mmcgrath> heh
20:16:05 <mmcgrath> it's the disk IO required to do rrd files.
20:16:13 <mmcgrath> there's lots of tricks we can (and do) use to fix that.
20:16:18 <mmcgrath> but as we grow, it's something to watch.
20:16:48 <mmcgrath> you'll notice that on the 19th I figured out that automatically polling every tcp port in use and recording that info was too expensive for us :)
20:16:51 <mmcgrath> duh
20:16:58 <mmcgrath> but yeah, something to watch.
20:17:14 <mmcgrath> there are also non-rrdtool collection methods we can use if we really need to that would also be useful
20:17:27 <mmcgrath> anywho, anyone have any questions on that?
20:17:43 <mmcgrath> I've been slowly adding more useful stuff
20:17:47 <mmcgrath> like -
20:17:49 <mmcgrath> .tiny https://admin.fedoraproject.org/collectd/bin/index.cgi?hostname=db1&plugin=mysql&timespan=86400&action=show_selection&ok_button=OK
20:17:51 <zodbot> mmcgrath: http://tinyurl.com/ye6nrvl
20:18:21 <mmcgrath> anywho, no more questions there so that's good.
20:18:28 <mmcgrath> #topic Monitoring
20:18:30 <smooge> mmcgrath, we did rrdtools in a ram drive
20:18:47 <mmcgrath> smooge: yeah that's basically what some of the tunables in the rrdtool plugin do.
20:18:59 <mmcgrath> Ok, so we're basically back to nagios and collectd.
20:19:15 <smooge> yeah.. we set it up like /var/spool/mqueue/t and had a 2 GB partition for it ..
20:19:29 <mmcgrath> hydh has been working on some stuff but I know he's been busy
20:19:33 <smooge> I want to say our community help on nagios has been cool
20:19:36 <smooge> and great
20:19:38 <mmcgrath> I'd like to get some basic event handlers finalized as well as proper deps in place.
20:19:42 <mmcgrath> smooge: indeed
20:20:13 <mmcgrath> anyone have any questions about what we're up to in nagios and where we're headed?
20:20:38 <mmcgrath> alrighty, well with that
20:20:41 <mmcgrath> #topic Open Floor
20:20:45 <mmcgrath> anyone have anything they'd like to discuss?
20:20:50 <mmcgrath> a-k: anything new on search engines?
20:21:01 <a-k> I'm still looking at mnoGoSearch with PostgreSQL
20:21:14 <a-k> I haven't had a chance to try crawling with it yet, but if it goes well I'll put it in pub test next week
20:21:25 <a-k> BTW is there a preference for MySQL vs PostgreSQL? I know/think we have them both around....
20:21:38 <dgilmore> a-k: no preference
20:21:57 <a-k> I'm okay with that.  That's about it for now.
20:22:24 <mmcgrath> anyone have anything else they'd like to discuss?
20:22:34 <gholms> Random question?
20:22:49 <gholms> Did you folks notice the collectd server-side daemon putting a lot of load on the machine?  I ask because that's what I experienced at $dayjob.
20:23:15 <mmcgrath> gholms: yeah, we fixed it with the suggestions here...
20:23:16 * mmcgrath gets link
20:23:33 <mmcgrath> http://collectd.org/wiki/index.php/Inside_the_RRDtool_plugin
20:24:01 <gholms> Ooh, that looks useful.  Thanks.
20:24:09 <mmcgrath> gholms: yup yup
20:24:11 <smooge> 1) going to work on log reviews over freeze. Now that we have over 50% free logs I was going to see what I could get out of it daily.
20:24:36 <smooge> I think someone else was working on this earlier so will hook up with them and see where it goes
20:24:43 <mmcgrath> smooge: cool
20:24:48 <mmcgrath> yeah someone was but I don't know the status
20:25:11 <smooge> then I am pretty much building my home/slicehost network to 'clone' F-I so I can test stuff here a bit better.
20:25:37 <smooge> my goal will be to see how far I can take epylog before it screams in terror at our data
20:25:50 <mmcgrath> smooge: sounds good, let us know if you need anything
20:25:53 <mmcgrath> well me
20:25:54 <mmcgrath> :)
20:26:18 <mmcgrath> ok, well with that I'll close the meeting in 30
20:26:44 <mmcgrath> 15
20:26:56 <mmcgrath> #endmeeting