19:00:18 #startmeeting Infrastructure (2011-06-09)
19:00:18 Meeting started Thu Jun 9 19:00:18 2011 UTC. The chair is nirik. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:00:18 Useful Commands: #action #agreed #halp #info #idea #link #topic.
19:00:18 #meetingname infrastructure
19:00:19 #topic Robot Roll Call
19:00:19 #chair goozbach smooge skvidal codeblock ricky nirik abadger1999
19:00:19 The meeting name has been set to 'infrastructure'
19:00:19 Current chairs: abadger1999 codeblock goozbach nirik ricky skvidal smooge
19:00:26 * ricky
19:00:26 * skvidal is here
19:00:34 * tideline here
19:00:36 * pingou around
19:00:38 * StylusEater is here
19:00:41 * CodeBlock silently wonders if nirik's lowercase 'codeblock' still makes me a chair.
19:00:46 present
19:00:48 http://goo.gl/doodle/rsSl
19:00:49 * athmane hello everyone
19:00:53 hello
19:02:10 morning all
19:02:56 hello
19:03:01 #topic New folks introductions / apprentice stuff
19:03:38 anyone want to say hi?
19:03:45 hi :)
19:03:47 any apprentice thoughts?
19:03:57 is there a current list of hosts managed by this team somewhere one the wiki
19:04:21 It's all in puppet, look at puppet/manifests/nodes
19:04:35 Hi, people I'm New Apprentice
19:04:44 ^ or nagios or virthost-list
19:04:56 virthost-lists.out*
19:05:06 welcome LoKoMurdoK
19:05:07 on which hosts?
19:05:09 hello, another one from fedora-qa
19:05:31 I am happy to belong to the team
19:05:35 tks
19:05:43 tideline: /var/log/virhost-lists.out and what ricky said are both puppet01, nagios is http://admin.fpo/nagios
19:05:56 CodeBlock: thanks
19:06:04 CodeBlock: virthost-lists doesn't cover physical hosts which are NOT xen/kvm hosts
19:06:17 truedat.
19:06:28 there's a sort of list in the mass update wiki page too...
19:06:35 tideline: /tmp/complete-minion-list on puppet1
19:06:39 is the total list
19:06:39 but ideally we would get rid of that in favor of something thats auto-generated. ;)
19:06:42 as of 30s ago
19:06:48 nirik: wilco
19:06:54 nirik: when infra.fp.o is hot
19:06:55 I've updated the wiki page about Nagios alert access
19:07:04 we should just have a location it is available in /srv/infra/
19:07:07 that you can simple ls
19:07:16 yeah.
19:07:19 I am currently working with the disclaimer of the planet, I'm a systems administrator in Panama, and thanks for the welcome.
19:07:27 Heh my clone died/they finally killed that vps that I stopped paying for like 2 months ago?
19:07:28 LoKoMurdoK: you're Luis?
19:07:30 yes
19:07:33 Luis
19:07:58 irc: LoKoMurdoK , FAS. lbazan,
19:08:02 NOTE to apprentice folks: I am going to go thru the group later today/tonight and remove folks who didn't reply to my email on the 2nd. :) Nothing personal, and we can re-add folks as they get time to work on things....
19:09:02 nirik: have you send the email?
19:09:08 I didn't get ine
19:09:10 on*
19:09:10 nirik: 2nd of June ?
19:09:11 tideline: I sent email on the 2nd...
19:09:21 yes. If you were added after that, no problem.
19:09:25 nirik: I'm still in said group, don't think I replyed to the email thought :)
19:09:26 I dont think I was in there on the 2nd
19:09:33 nirik: cool
19:09:58 goozbach: please do if you get a chance. ;)
19:10:05 the same here (< 1 week)
19:10:05 Subject was: "June Status update for Fedora Infrastructure Apprentices"
19:10:12 skvidal, yes I'm Luis
19:10:14 so, no worries... you will get one in July. ;)
19:10:29 ok, any more apprentice/new folks business? any questions? or shall we move on to the next topic?
19:10:39 nirik: infra list ?
19:11:01 pingou: directly to fi-apprentice-members
19:11:06 ah ok
19:11:26 LoKoMurdoK: nod - cool
19:11:28 ok, moving on.
19:11:41 #topic Upcoming Events / Tasks
19:12:11 I'm just going to info all these:
19:12:41 #info 2011-06-09 Remove inactive fi-apprentice people.
19:12:41 #info 2011-06-14 or so: post release housecleaning tasks.
19:12:42 #info 2011-06-14 Class B mass reboots
19:12:42 #info 2011-06-15 Class A mass reboots.
19:12:42 #info 2011-06-17 FPCA drop dead.
19:12:42 #info 2011-07-01 mail fi-apprentice folks.
19:12:44 #info 2011-07-01 BLOGS closing time.
19:12:46 #info 2011-07-11 - 14: smooge and nirik at phoenix
19:13:06 Does anyone have questions on those or other upcoming items we should plan for?
19:13:12 2010-07-11-14: Seth On PTO
19:13:13 :)
19:13:29 actually....
19:13:34 nice. ;)
19:13:36 is that going to be a problem?
19:13:43 I don't think so off hand.
19:13:49 on the plus side y'all will be RIGHT NEXT to most of the servers
19:13:55 on the minus side you might not be on hand on irc, etc
19:14:00 yeah.
19:14:07 I should be around.
19:14:10 * ricky will be around
19:14:12 CodeBlock: cool.
19:14:12 as long as some other main folks are around I think we will be ok
19:14:16 good
19:14:19 b/c it's eunice's b-day
19:14:20 well then skvidal you are allowed to go :)
19:14:25 and I don't htink I can get out of it :)
19:14:29 not w/o a lot of pain
19:14:29 :)
19:14:33 2010? or 2011
19:14:39 haha
19:14:54 smooge: He's going to go back in time, take PTO, and then rejoin us :)
19:14:58 one year past already
19:15:03 :)
19:15:05 ah man that is bad
19:15:33 wow
19:15:33 yah
19:15:34 2011
19:15:37 ^^
19:15:37
19:15:52 well have to head out now bbl
19:15:58 have fun
19:16:00 have fun smooge
19:16:00 Enjoy :-)
19:16:04 #topic Meeting tagged tickets:
19:16:05 https://fedorahosted.org/fedora-infrastructure/query?status=new&status=assigned&status=reopened&group=milestone&keywords=~Meeting&order=priority
19:16:18 any meeting tagged tickets folks would like to talk about?
19:16:33 .ticket 2517
19:16:34 ricky: #2517 (Need mod_evasive for EL6) - Fedora Infrastructure - Trac - https://fedorahosted.org/fedora-infrastructure/ticket/2517
19:16:40 can we just close that, since we've been doing fine without?
19:16:41 One thing on FPCA -- skvidal, wanna send out another nag mail?
19:16:54 abadger1999: do we want to send out another one?
19:16:59 * skvidal looks for spot
19:16:59 ricky: fine with me.
19:17:01 I'm fine with it
19:17:14 We're much improved now but still over a thousand packages will be orphaned.
19:17:16 * skvidal was not replying to mod_evasive
19:17:19 * skvidal was replying to abadger1999
19:17:22 sorry
19:17:37 abadger1999: wow, sounds like a win
19:17:37 :)
19:17:43 http://toshio.fedorapeople.org/fpca3/
19:17:46 skvidal: haha
19:17:46 abadger1999: b/c those 1K pkgs are obviously not maintained :)
19:18:07 it looks like about 2/3rds of people are signed now. (fedorapeople accessing folks that is)
19:18:07 True dat
19:18:38 athmane: you wanted to talk about ticket 308 some?
19:18:46 nirik, yes
19:18:51 #action send out another fpca nag... can do out of band
19:19:58 abadger1999: it's http://toshio.fedorapeople.org/fpca3/union_important_users.txt right?
19:20:05 skvidal: Correct
19:20:10 abadger1999: I can send out the same email I did the other day, if spot is okay w/ir
19:20:13 s/ir/it/
19:20:26 skvidal: Perfect. Thanks
19:20:34 athmane: ok, so you have a proposed policy, etc...
19:20:44 abadger1999: I'll email spot and cc you
19:20:53 yes, tested with Drupal but not yet with MW
19:21:12 also when i looked into puppet
19:21:15 athmane: yeah, we will want to possibly next roll out to stg and test all the apps?
19:21:39 each httpd has mod_sec but disabled (rules commented)
19:21:47 abadger1999: I just want spots okay b/c it has his name on it :0
19:22:04 Yeah, we actually ran into issues with mod_security the last time we enabled it
19:22:18 Which is why we want to be really careful with the rules this time - start them very permisive
19:22:38 nirik, the default rules-set are PITA
19:22:43 Will mod_security live on the app or proxy servers?
19:23:02 :)
19:23:19 * skvidal smiles at nothing in particular (sorry)
19:23:19 proxies are haproxy based ?
19:23:34 skvidal: No need to be sorry for being happy :-)
19:23:43 yeah, I would like to see a cautious approach... test in stg and see everything looking ok, then deploy on ONE proxy/app and test live for a while before rolling to the others.
19:23:48 athmane: The proxies run apache with varnish and haproxy in front of them (yeah, it's a little copmlex)
19:24:31 * ricky will say that he personally is not a fan of stuff like mod_security, but thanks for working on it nevertheless :-)
19:24:55 ricky, me too :)
19:25:22 what problem is mod_security solving for us currently?
19:25:22 ricky: this probably isn't the forum but I'd like to hear why ... maybe in admin later?
19:25:35 s/currently/in the future/
19:25:49 maybe 0-days in some apps ?
19:25:52 StylusEater: Sure, although it seems like skvidal is kind of getting at it too
19:26:07 we don't need to discuss it here
19:26:07 sorryt
19:26:21 * nirik hasn't used mod_security too much...
19:26:37 yeah, looks like it could help us with new attacks/0-days/things our apps are not yet hotfixed for.
19:26:39 As a blacklisting-type thing, there are always known ways to bypass it floating around - it's not even too hard when you can see the rules. At the same time, false positives create pain for admins
19:27:11 So that's why I'm not crazy about it - I try to fix problems at the core (hotfixing) when I see them.
19:27:38 Does mod_security have a logging-only configuration?
19:27:41 zodbot: .whoowns mod_security
19:27:42 That I wouldn't mind seeing.
19:28:04 ricky: can't we protect the list by making it "private" ? maybe that flies in the face of what we do, but ...
19:28:06 (Definitely as a required first step if we plan on deploying it further, I think)
19:28:09 ricky, yes you can make default action to log
19:28:39 hmm
19:28:52 so mod_security is one of the pkgs impacted if its owner doesn't sign the fpca soon
19:29:09 StylusEater: Eh, I'm not crazy about the security of it depending on the configs being private - attackers have all the time in the world to try permutations to get past the rules.
19:29:36 I wouldn't mind seeing mod_security + a plan where we actually monitor the logs from it
19:30:12 We currently don't do a great job of monitoring and addressing problems in logs... as a result, some of our logs are like, 50% tracebacks (just made that number up)
19:30:25 Which is largely my fault with FAS's case, but just saying :-)
19:30:41 so
19:30:49 we've talked about this at fudcon and in here
19:31:07 but ironing out these issues are all pieces of the whole problem
19:31:52 * athmane notes that mod_sec logs are huge
19:32:15 do we need mod_security immediately? is there a known threat?
19:32:26 yeah, handling logging would be nice.
19:32:41 skvidal: no, other than it's something athmane knows and was willing to work on.
19:32:58 perhaps we could refocus on log manageing/reporting?
19:33:01 not sure but you should ask mmcgrath (he opened that ticket)
19:33:08 okay - my concern is that it sounds like mod_security is going to need a lot more effort to do properly
19:33:19 and a bunch of other infrastructure needs to be better prepared for it
19:33:28 does that sound about right?
19:33:32 skvidal: yes
19:33:49 yeah.
19:33:59 Yup, log processing being the biggest one
19:34:16 right
19:34:28 ricky: we currently point rsyslog to a db?
19:34:37 it's not at a db, is it?
19:34:40 Nope, rsyslog to flat files
19:34:40 it's to disk
19:34:48 ricky: well a hierarchy of them
19:34:53 not all glommed together.
19:35:02 s/not all/not only all/
19:35:06 Oops, yeah, misused the word "flat" there.
19:35:33 other idea is to use a local log scanner, like http://code.google.com/p/apache-scalp/
19:35:42 and php-ids defs
19:35:56 so it feels a bit like we're off in the weeds here
19:36:04 yeah, how about this:
19:36:20 - lets defer mod_security for now until we have more handle on log management.
19:36:32 nirik: good idea
19:36:37 works for me
19:36:41 - lets file a ticket/see if athmane or others are willing to work on log management/reporting. ;)
19:36:54 possibly working on epylog, or other similar things.
19:37:14 i do log maintenance and reporting as part of my $dayjob
19:37:21 marchant: what do you use?
19:37:23 but env is very different
19:37:25 athmane: thoughts?
19:37:29 splunk
19:37:39 marchant: cool. :)
19:37:39 yah - that's not gonna fly
19:37:43 marchant, the same here
19:37:45 right
19:38:01 it is a great tool, but obviously not for fpo
19:38:08 the concept is great
19:38:12 I can do regulat checks etc (httpd logs, clamav, rkhunter etc..)
19:38:18 **regular
19:38:26 and it puts all logs into a central db for foresics and reporting
19:38:34 understood
19:38:38 it's not free software
19:38:41 so running it on our boxes
19:38:44 is a non-starter
19:38:46 no it is very expensive
19:38:50 right. we need a free solution. ;)
19:39:02 i definitely understand that
19:39:06 so - epylog is the only semi-maintained solution that I know of that's not full of pain and agony
19:39:12 s/full/completely full/
19:39:22 so, would you guys be willing to work on this? we can discuss details out of meeting?
19:39:31 mod_sec log analyzer tool also not foss
19:39:38 i am definitely willing
19:40:13 i am still learning
19:40:14 nirik, ok
19:40:22 skvidal: can you file a ticket (although, do we already have one?) with epylog info, etc?
19:40:32 I'll look if we have one
19:40:35 other wise, yes
19:40:57 thanks.
19:41:06 ok, any other meeting tickets?
19:41:13 (that we want to discuss?)
19:41:40 can i ask what will probably be mundate questions about the easyfix tickets in fedora-noc during the day?
19:41:52 absolutely... or fedora-admin...
19:41:54 Yes, feel free :-)
19:42:00 #topic Open Floor
19:42:05 anything for open floor?
19:42:09 i hate to get in the way of stuff that you are working on during the day
19:42:24 wasn't sure if that was a good platform
19:42:28 marchant: no problem... if people are busy they likely just won't answer. ;)
19:42:36 is there an apprentice irc
19:42:37 and then can answer when they get time.
19:43:02 marchant: ask in #fedora-admin if you have things
19:43:10 I thin kit is entirely welcome and appropriate
19:43:17 oh, for main folks... if anyone wants to help with class "C" host updates, please see me... it would be good to get those done as time permits before the other ones next week.
19:43:32 * nirik seconds skvidal's response.
19:43:33 nirik: puppet changes/moving things around - smooge seems down w/it - are you okay w/me moving servergroups into services?
19:43:44 yep. Make it so
19:43:45 nirik: class C .. low impact hosts ... no?
19:44:07 StylusEater: right. low impact or some measure of HA, so no end user impact
19:45:04 ok, lets continue over in #fedora-admin or #fedora-noc... thanks for coming everyone!
19:45:13 #endmeeting