rdo_meeting_(2016-06-01)
LOGS
15:01:00 <amoralej> #startmeeting RDO meeting (2016-06-01)
15:01:00 <zodbot> Meeting started Wed Jun  1 15:01:00 2016 UTC.  The chair is amoralej. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:01:00 <zodbot> Useful Commands: #action #agreed #halp #info #idea #link #topic.
15:01:00 <zodbot> The meeting name has been set to 'rdo_meeting_(2016-06-01)'
15:01:12 <amoralej> #topic roll call
15:01:12 <openstack> Meeting started Wed Jun  1 15:01:00 2016 UTC and is due to finish in 60 minutes.  The chair is amoralej. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:01:13 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:01:15 <dmsimard> \i
15:01:16 <openstack> The meeting name has been set to 'rdo_meeting__2016_06_01_'
15:01:25 <imcsk8> o/
15:01:26 <jpena> o/
15:01:37 <number80> \o
15:01:45 <trown> o/
15:01:45 <amoralej> #chair cmsimard imcsk8 jpena number80
15:01:45 <zodbot> Current chairs: amoralej cmsimard imcsk8 jpena number80
15:01:59 <apevec> dmsimard, you'll have to document how to handle openstack bot during meeting :)
15:02:11 <apevec> but really, infra fix is still not merged?!
15:02:11 <amoralej> #chair apevec
15:02:11 <zodbot> Current chairs: amoralej apevec cmsimard imcsk8 jpena number80
15:02:12 <dmsimard> hopefully they just fix it
15:02:19 <number80> #chair dmsimard
15:02:19 <zodbot> Current chairs: amoralej apevec cmsimard dmsimard imcsk8 jpena number80
15:02:19 <dmsimard> it's been outstanding since like january
15:02:25 <apevec> wow
15:02:30 <number80> #unchair cmsimard
15:02:30 <zodbot> Current chairs: amoralej apevec dmsimard imcsk8 jpena number80
15:02:33 <amoralej> thanks number80
15:02:46 <amoralej> let's start with the recurring ones
15:02:50 <dmsimard> apevec: https://review.openstack.org/#/q/topic:feature/Offical-channels
15:03:02 <amoralej> #topic DLRN instance migration to ci.centos infra
15:03:13 <amoralej> any update about promotion pipeline?
15:03:29 <apevec> trown, dmsimard ^what's left?
15:03:41 <dmsimard> we need tripleo to promote its symlink
15:03:46 <dmsimard> then tripleo can fully move to buildlogs
15:03:48 <trown> tripleo side should be good, just need a promote
15:03:52 <trown> working on that
15:03:58 <dmsimard> KB is hard to get a hold of, he's been really busy I think
15:04:06 <dmsimard> so the repositories that were meant to be deleted have not yet been deleted
15:04:10 <dmsimard> but he told me he'd do it ..
15:04:33 <dmsimard> on that topic, I opened https://github.com/redhat-openstack/rdo-release/issues/5
15:04:58 <apevec> dmsimard, ah thanks for pointing that, I missed it
15:05:00 <chandankumar> \o/
15:05:07 <dmsimard> hum, what else
15:05:18 <amoralej> #chair chandankumar
15:05:18 <zodbot> Current chairs: amoralej apevec chandankumar dmsimard imcsk8 jpena number80
15:05:38 <dmsimard> trown: I guess the promote jobs will be 100% seamlessly transferred over when just changing trunk-primary dns ?
15:05:47 <dmsimard> as in, they will use *and* promote from internal dlrn ?
15:06:07 <jruzicka> o/
15:06:20 <amoralej> #chair jruzicka
15:06:20 <zodbot> Current chairs: amoralej apevec chandankumar dmsimard imcsk8 jpena jruzicka number80
15:06:20 <trown> I am confused by what that means :)
15:06:49 <dmsimard> trown: Right now the promote jobs use trunk-primary.rdoproject.org for the promotion and trunk.rdoproject.org for the dlrn repositories
15:07:02 <trown> yep
15:07:09 <dmsimard> So right now both point to the public rcip instance
15:07:33 <dmsimard> When we switch trunk-primary to internal dlrn, it'll promote there instead
15:07:41 <dmsimard> and I guess trunk.rdoproject.org will point to the passive dlrn instance ?
15:07:49 <jpena> dmsimard: exactly that
15:08:14 <dmsimard> we probably need to adjust puppet-dlrn with the symlink redirection thing
15:08:23 <dmsimard> we changed our mind around what we hosted on buildlogs
15:08:24 <trown> so we need to update the consuming urls to trunk-primary?
15:08:45 <dmsimard> trown: using trunk-primary would be faster and cheaper
15:08:48 <dmsimard> for consuming
15:09:00 <dmsimard> cheaper because we pay money for the passive instance :)
15:09:22 <amoralej> dmsimard, about symlinks, we'll redirect current-passed-ci to buildlogs -tested
15:09:26 <amoralej> is that what you mean?
15:09:29 <trown> k... what is the point of the passive instance then?
15:09:34 <dmsimard> trown: tripleo
15:09:39 <dmsimard> or no wait
15:09:41 <dmsimard> puppet-openstack
15:09:44 <dmsimard> and people like that
15:09:53 <dmsimard> that want to pin on hashes or use consistent
15:09:57 <jpena> and everyone using the trunk packages from outside the ci.centos infra
15:10:15 <dmsimard> jpena: tbh I want to push people to use the tested buildlogs repo as much as possible
15:10:15 <jpena> ...and who do not use passed-ci
15:10:20 <dmsimard> yeah
15:10:32 <trown> so trunk-primary is only accessible from inside centos infra?
15:10:36 <dmsimard> trown: right
15:10:47 <trown> that means we cant switch to it for promote either
15:11:01 <trown> the images produced in CI need to work outside of CI
15:11:01 <dmsimard> not sure I understand
15:11:14 <dmsimard> oh, the repository is configured in the image
15:11:17 <dmsimard> we already had this conversation
15:11:34 * dmsimard tries to remember
15:11:34 <jpena> trown: not really, we want the passed-ci repo to be synced to buildlogs, so it needs to be in primary
15:11:48 <dmsimard> jpena: the trunk repository is configured in the image
15:11:51 <trown> ya, I get confused by this whole thing every time I try to wrap my head around it
15:12:08 <dmsimard> if the trunk repository trunk-primary.rdoproject is configured in the image, it won't work
15:12:29 <dmsimard> well, it won't work if you try to run it on your laptop or something
15:12:41 <apevec> so we could edit image before publishing and change .repo inside?
15:13:04 <apevec> virt-customize or something
15:13:06 <dmsimard> either that or just /etc/hosts, or something, I don't know, my brain isn't coming up with anything that doesn't feel hacky right now
15:13:17 <trown> apevec: that is pretty gross, hard to ensure we have same content that passed ci
15:13:44 <dmsimard> I guess there's nothing preventing the promote jobs from consuming trunk.rdoproject which is the passive instance and then promote on internal instance
15:13:57 <jpena> dmsimard: that sounds good
15:13:58 <dmsimard> it just won't be as efficient/fast
15:14:00 <amoralej> yes, every repo is synced from trunk-primary to trunk
15:14:27 <trown> what is less efficient?
15:14:41 <dmsimard> trown: pulling packages from outside of ci.centos infra
15:14:47 <dmsimard> and more expensive
15:15:14 <dmsimard> the internal dlrn server is quite literally next door, probably in the same rack alley
15:15:18 <trown> well if it is too expensive, it is not a great solution for the public either :)
15:15:33 <amoralej> there is no internal dns server in ci.centos?
15:15:40 <dmsimard> jpena: do we actually pay for bandwidth ? I assumed so
15:15:55 <jpena> dmsimard: not that I'm aware of, just the VM itself
15:16:03 <dmsimard> amoralej: I don't want to get involved in having rdoproject.org authoritative on the internal dns server of ci.centos.org
15:16:12 <dmsimard> jpena: oh okay then, not an issue
15:16:31 <rdogerrit> Merged openstack/neutron-distgit: ovs-agent requires openvswitch service  http://review.rdoproject.org/r/1253
15:16:49 <apevec> ok, what about cheating w/ /etc/hosts ?
15:16:56 <dmsimard> apevec: it's a non-issue
15:17:03 <dmsimard> apevec: we're not paying bandwidth
15:17:05 <apevec> put trunk.rdoproject.org with internal IP
15:17:15 <dmsimard> let's just leave things clean and use passive instance
15:17:17 <apevec> yes, but there's also speed
15:17:29 <dmsimard> we can address it if it's too much of a problem
15:17:34 <dmsimard> we're pulling packages from paris right now
15:17:46 <dmsimard> the dlrn passive instance is somewhere in north america
15:18:10 <dmsimard> and also is not loaded at all
15:18:14 <dmsimard> so it should already be an improvement
15:18:44 <amoralej> so do we have an agreement?
15:19:41 <amoralej> the action here is still for trown, right?
15:20:09 <trown> I can take an action to babysit tripleo promote
15:20:19 <trown> that is kind of a long standing action :)
15:20:41 <trown> I think that is all that is missing from tripleo perspective
15:21:00 <amoralej> but, in regards to migration, is it ready for the promotion in internal dlrn server?
15:21:20 <trown> #action trown babysit tripleo promote and make sure repo gets promoted correctly on internal dlrn
15:21:22 <dmsimard> Before the cutover, I can manually symlink the tested repos on the internal dlrn server
15:21:24 <trown> amoralej: ya
15:21:54 <dmsimard> Test day is soon, do we want to do it before that ? I think it's fine
15:22:32 <trown> I would say if we get it sorted this week that is fine, but otherwise we should wait until after test day
15:22:49 <amoralej> we need to send a mail to the users
15:22:58 <amoralej> with at least one week, i'd say
15:23:08 <amoralej> jpena has a notification mail ready
15:23:12 <amoralej> iirc
15:23:18 <jpena> amoralej: yes
15:23:20 <dmsimard> amoralej: it's mostly transparent from my understanding though, right ?
15:23:23 <trown> k, lets go for after test day
15:23:45 <jpena> dmsimard: yes, mostly transparent. We still want to drive users to buildlogs, so it's good to communicate that
15:24:05 <dmsimard> right
15:24:30 <amoralej> so, we need to send the mail, publish the new version of the release-rpms with buildlogs
15:24:36 <amoralej> and migrate
15:25:05 <amoralej> the script to sync to buildlogs is active, right? it'll pick up whenever we promote the links
15:25:37 <amoralej> can we put a date, then?
15:25:46 <dmsimard> amoralej: I will confirm with KB, I'll send an e-mail.
15:25:51 <amoralej> ok
15:26:16 <amoralej> #action dmsimard will confirm with kb status of buildlogs repos
15:26:23 <amoralej> so
15:26:26 <amoralej> next topic
15:26:35 <amoralej> #topic Increase timeout for Packstack CI jobs
15:26:43 <apevec> do we still see those?
15:27:06 <amoralej> i was investigating the two we had last week
15:27:07 <apevec> it was not reproducible outside upstream ci right?
15:27:20 <amoralej> no, i'm pretty sure it's related to slow infra in ci
15:27:44 <amoralej> both cases are in rax infra, and a task which usually takes 3 minutes or so
15:27:55 <amoralej> which is copying /opt to an ephemeral disk
15:28:04 <amoralej> in those two took 6 minutes
15:28:37 <amoralej> that doesn't involve network, or openstack, it's pure instance performance and it took double time
15:28:51 <apevec> ok, to conclude we're not changing timeout for now
15:29:01 <apevec> and will be monitoring
15:29:03 <amoralej> ok
15:29:23 <apevec> it could be included in rdo alerts?
15:30:00 <amoralej> i was playing with logstash.o.o to more easily identify the jobs that failed with timeout
15:30:35 <amoralej> but not sure how to add it to alerts, to be honest
15:31:06 <apevec> amoralej, you can take that offline w/ dmsimard
15:31:28 <amoralej> #action amoralej investigate about sending info to rdo alerts about slow gate jobs
15:31:38 <amoralej> #topic sync maintainers from rdoinfo to review.rdoproject.org (add people listed as maintainers in core)
15:31:56 <number80> Yeah, people listed in rdoinfo have no +2
15:32:15 <apevec> number80, do you have a list?
15:32:28 <apevec> we could fix few manually, but we really need sync script
15:32:39 <number80> apevec: I wrote one before going in PTO
15:32:56 <number80> https://gist.github.com/hguemar/4550930637f9163b2748e650b47e48c9
15:33:18 <number80> I add them in core-groups of each project
15:34:05 <apevec> number80, ah cool, please propose that script in rdo_gating_scripts.git
15:34:21 <apevec> although I'm not sure that repo is correctly named :)
15:34:22 <fbo> we need to be sure emails from rdoinfo match the email registered by user on review.rdo
15:34:35 <number80> apevec: ack, I'll submit review
15:34:57 <number80> #action number80 to submit sync rdoinfo maintainer script in rdo_gating_scripts.git
15:35:00 <apevec> fbo, ah good point, so we need to have them log in first
15:35:03 <EmilienM> apevec: I don't know if rabbitmq or erlang upgraded but we have a new AVC in puppet CI: http://logs.openstack.org/46/323546/2/check/gate-puppet-openstack-integration-3-scenario002-tempest-centos-7/28a4f06/console.html#_2016-06-01_14_24_47_770
15:35:21 <number80> fbo: script does that checking
15:35:29 <number80> https://gist.github.com/hguemar/4550930637f9163b2748e650b47e48c9#file-sync_rdoinfo_maintainers-sh-L20
15:35:31 <EmilienM> rhallisey: fyi ^
15:35:32 <apevec> EmilienM, it was upgraded
15:35:37 <fbo> number80, alright
15:35:46 <EmilienM> apevec: you have changelog handy?
15:36:05 <apevec> EmilienM, our CI job didn't hit it, but maybe it was permissive...
15:36:26 <EmilienM> apevec: our CI is permissive, but we catch AVC and fail if one is detected
15:36:28 <apevec> EmilienM, let's take it after meeting
15:36:34 <EmilienM> oops sorry
15:36:36 <slagle> did a new rabbitmq-server just get pushed out?
15:36:55 <EmilienM> slagle: see my last messages ^
15:37:00 <apevec> slagle, yes, but let's discuss after meeting
15:37:02 <slagle> oh :)
15:37:07 <amoralej> well, i think we can move on to the next topic
15:37:13 <apevec> slagle, it's sync w/ what's pushed to OSP9/10
15:37:30 <trown> ...
15:37:32 <apevec> (and should've been done in RDO Mitaka first... but...)
15:37:40 <trown> why wasnt it
15:37:40 <slagle> EmilienM: ok, all of tripleo-ci is failing with a different error
15:37:51 <trown> we can talk after meeting, but I think this broke tripleo
15:37:55 <slagle> we don't run with selinux enforcing anyway
15:38:05 <amoralej> #topic RHOSP/third-party repositories statuses in RDO documentation (Cf. https://github.com/redhat-openstack/website/pull/589)
15:38:25 <number80> new PR about a doc using a third-party repo
15:38:42 <number80> well, this is not the first one but we need to have policy about it
15:38:59 <number80> proposal: have a standard message indicating that the howto requires a third-party repository that is not supported by the RDO project
15:39:51 <apevec> +1
15:39:59 <amoralej> i agree too
15:40:09 <amoralej> is that something that must be managed
15:40:14 <amoralej> on each doc?
15:40:29 <amoralej> or is there some kind of templating to add notes in docs?
15:41:03 <apevec> I hope middleman has some kind of macros?
15:41:20 <apevec> rbowne is not here, misc ^ do you know?
15:41:59 <number80> I can fix one and add it in all concerned docs
15:42:21 <apevec> yeah, let's do one example first
15:42:49 <apevec> we can also review text wording there
15:43:04 <number80> #action number80 prototype warning about third-party repo
15:43:06 <amoralej> #action number80 will add a third-party note into a doc to review workding and propose
15:43:21 <amoralej> it's undo?
15:43:23 <apevec> #undo
15:43:23 <zodbot> Removing item from minutes: ACTION by amoralej at 15:43:06 : number80 will add a third-party note into a doc to review workding and propose
15:43:24 <number80> yes
15:43:27 <apevec> :)
15:43:29 <apevec> ok, next
15:43:29 <amoralej> :)
15:43:44 <amoralej> #topic chair for next meeting
15:44:05 <amoralej> i don't know who is the colour that proposed himself in the etherpad, :)
15:44:19 <imcsk8> hehehe
15:44:33 <imcsk8> anonymous!
15:44:41 <amoralej> it was you, imcsk8?
15:44:48 <imcsk8> nope
15:44:58 <imcsk8> but i can do it
15:45:23 <chandankumar> oh it was me
15:45:29 <amoralej> ok
15:45:44 <amoralej> #action chandankumar to chair next meeting
15:45:50 <chandankumar> thanks amoralej
15:46:33 <amoralej> so, any other topic?
15:47:02 <rhallisey> EmilienM, what's up? Sorry internet trouble
15:47:04 <apevec> do we want to discuss rabbitmq ?
15:47:11 <EmilienM> slagle: ^
15:47:17 <apevec> #topic open floor
15:47:23 <EmilienM> rhallisey: so tripleo CI is currently broken
15:47:26 <apevec> on the dance floor
15:47:32 <trown> more generally, how do we communicate package updates in the deps repos
15:47:34 <EmilienM> so is Puppet CI but for another rabbitmq problem
15:47:43 <apevec> trown, it was sitting in common-pending
15:47:49 <EmilienM> rhallisey: I found a new AVC with latest rabbit http://logs.openstack.org/46/323546/2/check/gate-puppet-openstack-integration-3-scenario002-tempest-centos-7/28a4f06/console.html#_2016-06-01_14_24_47_770
15:47:51 <trown> it seems every time it happens we break
15:47:53 <apevec> but we don't have that pipeline
15:48:04 <apevec> which would test -testing + -pending
15:48:09 <trown> ya..., I could at least manually test
15:48:14 <apevec> we did run weirdo generic job
15:48:20 <apevec> and it worked there
15:48:28 <apevec> so I'm not sure what's different in tripleo ?
15:48:33 <EmilienM> rhallisey: puppet CI deploys rabbitmq in ssl
15:48:42 <trown> lots :)
15:48:45 <apevec> EmilienM, that AVC is correct, rabbitmq should not write to etc
15:48:55 <EmilienM> apevec: it does not need to write, I don't get it
15:49:07 <EmilienM> rabbit tries to write
15:49:12 <EmilienM> in the ssl certif it uses
15:49:14 <apevec> avc:  denied  { write } for  pid=7758 comm="async_16" name="centos-7-internap-nyj01-1338789.pem" dev="vda1" ino=4702782 scontext=system_u:system_r:rabbitmq_t:s0 tcontext=unconfined_u:object_r:etc_t:s0 tclass=file
15:50:01 <apevec> jpena, packstack doesn't deploy rabbitmq ssl ?
15:50:03 <EmilienM> where can we read changelog of latest rabbit?
15:50:10 <EmilienM> no they don't
15:50:14 <apevec> it's major upgrade
15:50:14 <jpena> apevec: it can use ssl, but not by default
15:50:22 <imcsk8> apevec: not by default
15:50:38 <EmilienM> our CI deploys SSL & ipv6 on centos7 jobs
15:50:39 <apevec> ok, so we need a scenario w/ ssl turned on
15:50:44 <EmilienM> ++
15:51:17 <rhallisey> EmilienM, I'll check it out
15:51:43 <apevec> so now you'll want a revert I guess, but then we'll have to bump epoch... which sucks
15:51:59 <apevec> number80, ^
15:52:15 <slagle> here's the tripleo bug: https://bugs.launchpad.net/tripleo/+bug/1587961
15:52:20 <slagle> different from the puppet AVC
15:52:21 <apevec> as I said, this upgrade should've been done early in RDO Mitaka cycle
15:52:26 <slagle> we are trying a quick patch
15:52:58 <apevec> it's pushed d/s in OSP9 and I wanted to catch up in RDO
15:53:03 <number80> apevec, EmilienM: can't we fix it ? We shouldn't keep old rabbit
15:53:16 <apevec> I'm trying to get relevant developers involved in RDO and develop here first
15:53:18 <trown> there is a quick fix up in tripleo https://review.openstack.org/#/c/323995/1
15:53:31 <trown> if that works, we may not need revert in tripleo
15:54:41 <apevec> ok, that's good news
15:54:41 <trown> I know it is partly on me for not having the generic promote setup yet, but can we at least do a manual test with tripleo before pushing to deps repo in the meantime?
15:54:45 <EmilienM> I have no idea why we have this avc
15:55:00 <number80> trown: that patch sounds like the correct thing to do :)
15:55:01 <EmilienM> I wouldn't block a promotion, we can disable the failure if AVC is found
15:55:30 <apevec> trown, yes, what would a manual test include?
15:56:05 <apevec> would it be run locally or on CI system?
15:56:27 <amoralej> trown, be careful with https://bugzilla.redhat.com/show_bug.cgi?id=1303803 in ha configurations
15:56:28 <trown> apevec: by manual I mean just running locally
15:56:51 <apevec> trown, ok, what would be the script?
15:57:31 <apevec> number80, I've 64gb box so I can take that before pushing updates
15:57:35 <apevec> (in the future)
15:57:49 <number80> apevec: +2+W
15:58:20 <number80> the max I have is 32 and just enough to test 3o quickstart
15:58:59 <trown> apevec: ok, that still requires me putting something up :), but is a bit easier than the full promote pipeline in jjb
15:59:35 <apevec> would it be one of https://github.com/openstack/tripleo-quickstart/tree/master/ci-scripts ?
16:01:03 <apevec> #action trown to put something up in tripleo-quickstart/tree/master/ci-scripts for local validation of common-pending updates in the future
16:01:09 <apevec> we're over time!
16:01:19 <amoralej> ok
16:01:21 <amoralej> #endmeeting