rdo_meeting_-_2016-07-20
LOGS
15:00:30 <imcsk8> #startmeeting RDO meeting - 2016-07-20
15:00:30 <zodbot> Meeting started Wed Jul 20 15:00:30 2016 UTC.  The chair is imcsk8. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:30 <zodbot> Useful Commands: #action #agreed #halp #info #idea #link #topic.
15:00:30 <zodbot> The meeting name has been set to 'rdo_meeting_-_2016-07-20'
15:00:35 <openstack> Meeting started Wed Jul 20 15:00:30 2016 UTC and is due to finish in 60 minutes.  The chair is imcsk8. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:36 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:38 <openstack> The meeting name has been set to 'rdo_meeting___2016_07_20'
15:00:39 <imcsk8> #topic roll call
15:00:51 <coolsvap> o/
15:01:01 <rbowen> ¯\_(ツ)_/¯
15:01:05 <mengxd> o/
15:01:16 <apevec> o/
15:01:20 <imcsk8> #chair apevec coolsvap jruzicka rbowen mengxd
15:01:20 <zodbot> Current chairs: apevec coolsvap imcsk8 jruzicka mengxd rbowen
15:01:26 <jjoyce> o/
15:01:32 <trown> ☃
15:01:48 <imcsk8> #chair jjoyce trown
15:01:48 <zodbot> Current chairs: apevec coolsvap imcsk8 jjoyce jruzicka mengxd rbowen trown
15:02:22 <imcsk8> ok, let's start
15:02:25 <eggmaster> \m/ -_- \m/
15:02:32 <imcsk8> #chair eggmaster
15:02:32 <zodbot> Current chairs: apevec coolsvap eggmaster imcsk8 jjoyce jruzicka mengxd rbowen trown
15:02:47 <apevec> trown, it's too hot for snowman
15:02:55 <imcsk8> #topic newton2 testday readiness
15:02:59 <trown> wishful thinking
15:03:15 <apevec> is that testday readiness summary? :)
15:03:24 <weshay> heh
15:03:24 <trown> lol, it does work there too
15:03:33 <apevec> so we're down to 1 issue
15:03:46 <trown> though I think there is a better chance of promotion than building a snowman outside right now
15:03:51 <trown> so we have that going for us
15:04:02 <weshay> there is a weirdo failure on scen001
15:04:23 <apevec> dmsimard, ^ has it fixed for the next run iiuc ?
15:04:31 <weshay> and an introspection issue that we've confirmed https://review.openstack.org/#/c/344792/ fixes
15:04:32 <apevec> it's only puppet scn1
15:04:36 <trown> overcloud deploy just failed on HA as well, which would not be introspection
15:04:46 <weshay> doh
15:04:57 <number80> o/
15:05:00 <trown> waiting on logs
15:05:10 <chandankumar> \o/
15:05:16 <imcsk8> #chair chandankumar
15:05:16 <zodbot> Current chairs: apevec chandankumar coolsvap eggmaster imcsk8 jjoyce jruzicka mengxd rbowen trown
15:05:25 <dmsimard> apevec: yeah, the gerrit replication wasn't working since the gerrit replication revamp for weirdo repositories
15:05:32 <dmsimard> so the fix landed in the gerrit repo but wasn't replicated to github
15:05:34 <dmsimard> fixed it this morning
15:05:35 <trown> ha still has a high transient failure rate, so might not be a new issue
15:05:45 <dmsimard> trown: see my comment re: firewalld and networkmanager
15:05:47 <dmsimard> ?
15:05:59 <dmsimard> trown: it might explain the flapping to a certain extent
15:06:32 <trown> dmsimard: hmm, maybe you could uninstall those at the beginning of weirdo?
15:06:34 <jtomasek> o/
15:06:43 <imcsk8> #chair jtomasek
15:06:43 <zodbot> Current chairs: apevec chandankumar coolsvap eggmaster imcsk8 jjoyce jruzicka jtomasek mengxd rbowen trown
15:06:46 <florianf> o/
15:06:59 <imcsk8> #chair florianf
15:06:59 <zodbot> Current chairs: apevec chandankumar coolsvap eggmaster florianf imcsk8 jjoyce jruzicka jtomasek mengxd rbowen trown
15:07:02 <dmsimard> trown: would ideally not manage that in weirdo (i.e, bake it in the image for review.rdo and do it some other way for ci.centos)
15:07:12 <dmsimard> but yeah, we should definitely try it
15:07:14 <trown> k
15:07:20 <jrist> #chair jrist
15:07:20 <dmsimard> see if that fixes some flapping
15:07:44 <number80> jrist: only chairs can chair
15:07:51 <jrist> oh :)
15:07:52 <jrist> o/
15:08:03 <imcsk8> #chair jrist
15:08:03 <zodbot> Current chairs: apevec chandankumar coolsvap eggmaster florianf imcsk8 jjoyce jrist jruzicka jtomasek mengxd rbowen trown
15:08:32 <apevec> so summary for the meeting minutes is: not ready but have a good chance?
15:09:19 <dmsimard> yeah and
15:09:20 <apevec> we can retrigger after introspection fix is merged and built in dlrn
15:09:41 <dmsimard> #action dmsimard to investigate if removing firewalld and networkmanager from the default centos installations can help alleviate flapping results
15:09:44 <apevec> (18. in current issues)
15:10:13 <apevec> dmsimard, ^ is there more info on that? where have you seen it in logs?
15:10:38 <apevec> NM was supposed to work fine w/ Packstack
15:10:42 <apevec> imcsk8, ^ ?
15:10:48 <dmsimard> apevec: it's part intuition, part it's documented to remove those anyway, part these are not installed upstream
15:11:07 <imcsk8> apevec: i've been testing packstack without disabling NM for a while now with no problems
15:11:14 <apevec> imcsk8, what about firewalld ?
15:11:27 <apevec> it could be disabled w/o removing?
15:11:35 <imcsk8> firewalld has to be disabled at least
15:13:59 <apevec> ok, who will watch over 18. and retrigger promotion pipeline when it gets built in RDO Trunk?
15:14:48 <imcsk8> apevec, dmsimard https://github.com/puppetlabs/puppetlabs-firewall/blob/master/manifests/linux/redhat.pp#L29
15:14:50 <apevec> oh it is not +W yet? https://review.openstack.org/#/c/344792/
15:15:00 <imcsk8> the firewall puppet module disables firewalld
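A minimal sketch of what stopping and disabling firewalld and NetworkManager on a CentOS test node could look like, assuming a systemd-based image; this is illustrative only and is not the puppetlabs-firewall module or the actual weirdo/ci.centos tooling:

```python
#!/usr/bin/env python3
"""Illustrative sketch only: stop and disable firewalld and NetworkManager
on a CentOS test node before a deployment run. Not the puppetlabs-firewall
module or the real weirdo/ci.centos tooling."""
import subprocess

SERVICES = ["firewalld", "NetworkManager"]

def unit_present(name):
    # Grep the list-unit-files output rather than relying on the exit
    # code, which differs across systemd versions when nothing matches.
    out = subprocess.run(
        ["systemctl", "list-unit-files", name + ".service"],
        capture_output=True, text=True,
    ).stdout
    return name + ".service" in out

for svc in SERVICES:
    if unit_present(svc):
        subprocess.run(["systemctl", "stop", svc], check=False)
        subprocess.run(["systemctl", "disable", svc], check=False)
        print(svc + ": stopped and disabled")
    else:
        print(svc + ": not installed, nothing to do")
```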
15:15:04 <trown> #action trown babysit instack-undercloud patch
15:15:17 <trown> apevec: no it is not passing upstream CI, because upstream CI issues
15:15:59 <number80> seriously?
15:16:06 <number80> (the firewalld thing)
15:16:08 <weshay> ya.. downloading packages
15:16:12 <weshay> weee
15:16:32 <trown> anything else on this topic?
15:16:56 <imcsk8> ok, let's continue
15:17:30 <imcsk8> #topic RDO CI: POWER nodes sizing (w/ mengxd)
15:18:09 <mengxd> yes, i want to discuss with the team to understand the h/w requirements for RDO CI
15:18:29 <mengxd> https://ci.centos.org/view/rdo/view/all/
15:18:44 <weshay> k
15:19:17 <mengxd> from the above link, i can see there are about 5 nodes used for RDO ci. Is this number correct?
15:20:08 <weshay> nodes?
15:20:20 <mengxd> physical server
15:20:53 <mengxd> that is the physical servers in the node pool in the CI system, right?
15:20:53 <weshay> we have the slave governed to start 15 max at any time
15:21:13 <weshay> mengxd, atm I see all 15 running
15:21:46 <weshay> so there will be at least 15 servers/nodes reserved by rdo-ci atm.. plus any that have "leaked"
15:21:50 <mengxd> weshay: i think that 15 is the CI pipelines, not the nodes
15:22:11 <weshay> 15 physical servers
15:22:47 <dmsimard> mengxd, weshay: We're in the process of adding more slave capacity
15:22:48 <mengxd> i saw 15 pipelines under rdo-ci-slave01
15:22:57 <dmsimard> mostly to distribute the load and have redundancy
15:23:15 <weshay> dmsimard, ya.. we should then limit each to 5 jobs
15:23:15 <weshay> if we end up w/ 3
15:23:15 <weshay> 3 slaves
15:23:16 <dmsimard> We will lower the amount of threads on the current slave and use 3 more slaves (2 are under testing right now)
15:23:20 <weshay> mengxd, not sure what you mean re: pipelines
15:23:29 <weshay> dmsimard++
15:23:30 <zodbot> weshay: Karma for dmsimard changed to 3 (for the f24 release cycle):  https://badges.fedoraproject.org/tags/cookie/any
15:24:04 <weshay> mengxd, each job when running uses 1 physical server
15:24:32 <mengxd> weshay: do we really need one physical for a job? i thought they are running on top of VMs
15:24:40 <weshay> mengxd, yes we need that
15:25:01 <dmsimard> mengxd: there will be an openstack cloud available for virtual workloads
15:25:05 <dmsimard> soon
15:25:07 <weshay> these jobs are deploying openstack on vms on top of the physical server
15:25:29 <dmsimard> mengxd: we try not to run jobs directly on the slaves as they are static, we run the jobs on ephemeral nodes
15:25:49 <weshay> hopefully we have no jobs running on the slaves
15:25:59 <mengxd> ok, then if i need to enable RDO on ppc64le, what are the minimal h/w requirements? what will trigger the ci job?
15:26:07 <dmsimard> weshay: we might have some things.. like lint jobs or things like that
15:26:33 <dmsimard> mengxd: we don't have ppc64le available on the ci.centos environment, I don't think
15:26:37 <dmsimard> mengxd: would need to check.
15:26:53 <mengxd> you are right, but i have an interest to enable that
15:27:07 <mengxd> so i want to get some idea about the h/w requirements if we do so.
15:27:11 <dmsimard> mengxd: okay, right
15:27:21 <weshay> http://docs.openstack.org/developer/tripleo-docs/environments/virtual.html
15:27:38 <weshay> for CI, we need to test w/ HA.. which requires 64gb of memory on the host
15:27:44 <weshay> same w/ upgrades
15:27:47 <imcsk8> #chair weshay
15:27:47 <zodbot> Current chairs: apevec chandankumar coolsvap eggmaster florianf imcsk8 jjoyce jrist jruzicka jtomasek mengxd rbowen trown weshay
15:27:48 <number80> please provide multiple scenarios
15:28:00 <dmsimard> mengxd: This is the documentation for the hardware currently on ci.centos.org: https://wiki.centos.org/QaWiki/PubHardware
15:28:08 <imcsk8> #chair number80
15:28:08 <zodbot> Current chairs: apevec chandankumar coolsvap eggmaster florianf imcsk8 jjoyce jrist jruzicka jtomasek mengxd number80 rbowen trown weshay
15:28:28 <imcsk8> #chair dmsimard
15:28:28 <zodbot> Current chairs: apevec chandankumar coolsvap dmsimard eggmaster florianf imcsk8 jjoyce jrist jruzicka jtomasek mengxd number80 rbowen trown weshay
15:28:43 <dmsimard> mengxd: Jobs are currently running on nodes with 32GB ram, 8 cores and disk space varies between 200GB and 500GB I believe
15:28:43 <mengxd> so from the 1st link, it seems one big power server is enough for triple-o test
15:29:03 <dmsimard> mengxd: some jobs are designed to fit within 8GB of ram, others really require  32GB at the very least.
15:29:31 <weshay> dmsimard, mengxd really 32gb is min, 64 is ideal
15:29:45 <dmsimard> weshay: there's no 64GB anywhere on ci.centos, where's that number from ?
15:29:50 <weshay> w/o 64 we can't test a sudo supported deployment
15:29:59 <weshay> dmsimard, aye I know
15:30:13 <jrist> in fact he's probably the most familiar, ha
15:30:13 <number80> mengxd: we'll also need enough capacity for gating DLRN changes (I mean packaging changes)
15:30:34 <dmsimard> number80: that's from review.rdo though
15:30:45 <number80> dmsimard: well, we can plug external CI
15:30:48 <mengxd> so what will trigger a RDO CI job now?
15:30:49 <dmsimard> number80: though it could be third party
15:30:56 <imcsk8> isn't 65Gb too much??
15:31:02 <imcsk8> (64)
15:31:04 <dmsimard> imcsk8: tripleo.
15:31:06 <number80> for 3o? nope
15:31:06 <weshay> mengxd, an update to a yum repo
15:31:23 <weshay> mengxd, and we also have periodic triggers
15:31:26 <number80> actually 32GB is barely enough
15:31:30 <mengxd> btw, i can run full Tempest with 16GB memory on a CentOS VM, not sure why we need 64GB here
15:31:33 <dmsimard> mengxd: we periodically check if new RDO packages have been built and if so, we trigger a series of jobs to check if those packages work well.
15:31:48 <dmsimard> mengxd: tripleo has particular requirements
15:31:49 <mengxd> are you running full Tempest?
15:32:04 <dmsimard> mengxd: packstack and puppet-openstack don't require more than 8GB of RAM
15:32:06 <weshay> mengxd, for HA 3 controller, 2 compute 1 ceph is the min official support arch
15:32:15 <weshay> we don't attempt that today
15:32:20 <weshay> but that is the CI requirement
15:32:33 <weshay> we work w/ in the current hardware atm
15:32:38 <weshay> and test the rest else where
15:33:22 <mengxd> ok, so even for triple-O, we can test RDO with VMs (nested virtualization), right?
15:33:38 <weshay> we're very happy w/ what we have.. but those are the requirements we've been given
15:33:46 <dmsimard> mengxd: yes, we test with nested virtualization in the review.rdoproject.org and the review.openstack.org environment.
15:33:56 <dmsimard> mengxd: well, wait, I read that wrong
15:34:28 <dmsimard> mengxd: tripleo does nested virt itself (through tripleo-quickstart), the job is not designed to run on a VM (since then you'll end up with ultimately 3 layers of nested virt)
15:34:47 <weshay> dmsimard, that does work though
15:34:57 <dmsimard> weshay: quickstart on VMs ? must be slow, no ?
15:35:07 <weshay> ya.. not saying it's ideal.. but it works
15:35:11 <dmsimard> ok
15:35:20 <number80> that explains why jobs are so long
15:35:28 <mengxd> ok, usually how long will each CI job take?
15:35:32 <dmsimard> number80: which jobs ?
15:35:38 <number80> dmsimard: tripleo
15:35:48 <number80> mengxd: 3 to 5 hours
15:36:00 <dmsimard> mengxd: packstack and puppet openstack finish within 45 minutes, tripleo takes several hours
15:36:03 <weshay> mengxd, w/ quickstart a min job is 1:15 and upgrade or scale can run as much as 3.5 hrs
15:36:11 <dmsimard> weshay: are you including the image build in that time ?
15:36:27 <weshay> dmsimard, image build is only done in the promotion pipeline
15:36:35 <dmsimard> fair
15:36:51 <number80> we manage to get average below 3hours?
15:36:53 <mengxd> ok, and how frequently will CI job be triggered?
15:36:54 <number80> woot
15:37:23 <weshay> we are working on downloading an already deployed stack and restarting it..
15:37:26 <dmsimard> mengxd: well, that's up to you I guess ? I'm not sure where you want this job and what you want it to test
15:37:38 <weshay> so that upgrades, scale etc. don't have to deploy the initial cloud each time
15:37:52 <weshay> which will bring run times down to 1.5 hrs for upgrades which is our longest job
15:38:05 <weshay> it's only WIP atm
15:38:41 <mengxd> weshay: that is really nice to have. since it can save a lot of time.
15:38:50 <weshay> agree.. upgrades are terrible
15:39:24 <dmsimard> I have to step out for an appointment, I'll be afk for a bit
15:39:35 <mengxd> dmsimard: how about the current RDO CI? Is it reporting to every community patch-set?
15:39:43 <weshay> mengxd, no
15:40:07 <weshay> mengxd, we poll the git repo every 4hrs or so.. check if there is a change.. if true; then execute
15:40:15 <weshay> other jobs are configured to run once a day
15:40:30 <weshay> for instance
15:40:30 <weshay> https://ci.centos.org/view/rdo/view/promotion-pipeline/
15:40:33 <weshay> are triggered off the yum repo
15:40:48 <weshay> https://ci.centos.org/view/rdo/view/tripleo-gate/  are triggered off of changes to CI src..
15:40:54 <weshay> so every patch
15:41:02 <weshay> https://ci.centos.org/view/rdo/view/tripleo-periodic/
15:41:06 <weshay> are triggered once a day
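A minimal sketch of the poll-and-trigger pattern weshay describes above (check a yum repo for changes; if it changed, run the jobs), assuming a hypothetical repo URL and a placeholder trigger command; the real triggers are Jenkins jobs on ci.centos.org:

```python
#!/usr/bin/env python3
"""Illustrative sketch of a poll-and-trigger loop: hash a yum repo's
repomd.xml and kick off a job when it changes. The repo URL and trigger
command are placeholders, not the actual ci.centos configuration."""
import hashlib
import pathlib
import subprocess
import urllib.request

REPOMD_URL = "https://trunk.rdoproject.org/centos7/current/repodata/repomd.xml"  # placeholder
STATE_FILE = pathlib.Path("/var/tmp/last-repomd.sha256")

def current_hash():
    with urllib.request.urlopen(REPOMD_URL) as resp:
        return hashlib.sha256(resp.read()).hexdigest()

def main():
    new = current_hash()
    old = STATE_FILE.read_text().strip() if STATE_FILE.exists() else ""
    if new != old:
        # Repo metadata changed: new packages were built, so run the jobs.
        subprocess.run(["echo", "would trigger promotion pipeline here"], check=True)  # placeholder
        STATE_FILE.write_text(new)
    else:
        print("no change, nothing to do")

if __name__ == "__main__":
    main()
```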
15:41:14 <imcsk8> guys, i think we're getting a little off topic and we have other stuff to address
15:41:28 <weshay> sure
15:41:42 <mengxd> weshay: so i just want to get an estimate about the CI h/w reqs
15:41:53 <mengxd> maybe we can talk in mailing list
15:42:37 <jruzicka> or here after the meeting ;)
15:42:43 <mengxd> sure
15:42:43 <weshay> mengxd, sure np.. we're very grateful for what we have in ci.centos but the requirements of tripleo are what they are :(
15:44:15 <imcsk8> is there anything else? do you want an action for continuing this topic on the ML?
15:44:55 <mengxd> yes, i will send out a note to mailing list for further discussion
15:45:23 <imcsk8> #action mengxd to send a message to the Mailing List about CI requirements
15:45:25 <mengxd> so we can move on with other topics
15:45:36 <imcsk8> next topic
15:45:48 <imcsk8> #topic tripleo-ui packaging
15:46:47 <imcsk8> jrist: ^^
15:47:01 <jrist> honza, jtomasek, florianf
15:47:04 <jrist> so yeah
15:47:12 <jrist> thanks imcsk8
15:47:18 <jrist> we've got to get tripleo-ui packaged
15:47:26 <jrist> and we've got some first steps with upstream openstack
15:47:27 <imcsk8> #chair jrist
15:47:27 <zodbot> Current chairs: apevec chandankumar coolsvap dmsimard eggmaster florianf imcsk8 jjoyce jrist jruzicka jtomasek mengxd number80 rbowen trown weshay
15:47:42 <jrist> but there is a concern for what we need to do for compilation
15:47:51 <jrist> i.e. all of the possible dependencies
15:48:00 <jrist> note, tripleo-ui is npm/react based
15:48:03 <jruzicka> which dependencies are problematic in particular?
15:48:15 <jrist> well, it is npm based, so there are many npm packages
15:48:29 <jrist> jruzicka: we would like to understand what might be already packaged
15:48:39 <jrist> or if there is an npm repo that we can work from, instead of packaging
15:48:42 <jrist> if that is not possible
15:48:53 <jrist> we will have to package the dependencies that are not already in RPMs
15:49:36 <jrist> does anyone have any insight? is this something we can or need to set up another meeting for
15:49:44 <jrist> to not derail the #rdo meeting
15:50:10 <jtomasek> in general, we're talking hundreds of dependencies, given the nature of how npm packages work
15:50:20 <jrist> last count was 856, but it will reduce a little
15:50:24 <jtomasek> whole ecosystem is very similar to this https://fedoraproject.org/wiki/KojiMavenSupport
15:50:41 <florianf> to give a bit of background: this discussion has been going on for some days now on various channels. Initially we were hoping to find a way to deliver the compiled/minified JS packages with the UI package.
15:50:51 <jrist> honza mentioned that there might be a fedora NPM registry
15:50:58 <jrist> but that it won't be ready until next year
15:51:01 <jtomasek> an approved npm registry where the dependencies could be sourced would be a nice solution
15:51:31 <honza> i think it was number80 who suggested we might be able to get away with only packaging the build toolchain to start and work on the rest later
15:52:51 <number80> yes
15:52:54 <apevec> honza, yes, how big is the toolchain?
15:53:06 <jtomasek> problem is that the build toolchain is most of the deps
15:53:12 <jtomasek> say 500
15:53:13 <honza> apevec: i'll let jtomasek answer that one
15:53:37 <jtomasek> maybe less if we don't need to include testing tools
15:53:53 <apevec> do you have a dep tree? this sounds insane :)
15:54:00 <sshnaidm> apevec, hi there
15:54:18 <jruzicka> yeah, sounds pretty insane if no npm rpms are available ATM
15:54:21 <sshnaidm> apevec, do you have a time for delorean issue talk?
15:54:22 <number80> jtomasek: we can ignore testing tools, and not request strict unbundling for toolchain
15:54:37 <apevec> sshnaidm, we're in the meeting
15:55:02 <jtomasek> number80: that means that we would not need to package dependencies of the toolchain dependencies?
15:55:34 <jtomasek> this is the list of the direct app dependencies https://github.com/openstack/tripleo-ui/blob/master/package.json
15:56:08 <number80> jtomasek: top priority is to build from sources and have the toolchain available
15:56:23 <number80> unbundling will be ongoing work but not a blocker for this package
15:56:57 <number80> (and if this lands in RHOSP, you'd have to do it anyway)
15:57:18 <jtomasek> this is the full dependency tree http://paste.openstack.org/show/538860/
15:57:30 * jrist winces
15:57:44 <jtomasek> we're in process of cutting down some of them but not a lot
15:57:47 <florianf> number80: this is supposed to land in RHOSP10
15:57:50 <jruzicka> what a magnificent tree!
15:57:54 <jrist> jruzicka: :)
15:57:55 <imcsk8> !!
15:58:13 <imcsk8> guys, we're almost at the top of the hour. can we proceed with the next topic?
15:58:31 <jtomasek> ok
15:58:36 * chandankumar thinks https://github.com/ralphbean/npm2spec might be useful for creating npm specs for packages
15:58:43 <jrist> thanks chandankumar
15:58:48 <honza> chandankumar: nice
15:58:54 <jrist> imcsk8: sounds like we need to have another separate meeting. thanks
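A minimal sketch of sizing the packaging effort from a tripleo-ui checkout's package.json, assuming a local clone; it only counts direct runtime and toolchain dependencies and does not walk the transitive tree that produced the ~856 figure discussed above:

```python
#!/usr/bin/env python3
"""Illustrative sketch only: count direct runtime and dev/toolchain
dependencies declared in a package.json, to get a first estimate of the
RPM packaging effort for an npm-based project such as tripleo-ui."""
import json
import sys

def count_deps(path="package.json"):
    with open(path) as f:
        pkg = json.load(f)
    runtime = pkg.get("dependencies", {})
    dev = pkg.get("devDependencies", {})
    print("direct runtime dependencies: %d" % len(runtime))
    print("direct dev/toolchain dependencies: %d" % len(dev))
    for name, version in sorted(runtime.items()):
        print("  %s %s" % (name, version))

if __name__ == "__main__":
    count_deps(sys.argv[1] if len(sys.argv) > 1 else "package.json")
```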
15:58:54 <imcsk8> #topic rdopkg 0.38 released
15:59:06 <jruzicka> yeah, that's just a quick info
15:59:37 <jruzicka> new version contains bugfixes, cbsbuild command by number80 and 1000 less sloc of obsolete code
15:59:41 <jruzicka> number80++
15:59:50 <imcsk8> nice!
15:59:59 <jruzicka> 18.5 % of code is gone ;)
16:00:08 <imcsk8> cool!!
16:00:18 <chandankumar> sweet!
16:00:27 <jruzicka> so let me know if something broke as always and that's all :)
16:00:34 <imcsk8> ok, next
16:00:38 <imcsk8> #topic Chair for next meetup
16:00:42 <number80> jruzicka: awesome
16:01:17 <chandankumar> i am up for chairing.
16:01:23 <rdogerrit> hguemar proposed DLRN: Add rdopkg reqcheck output in CI runs  http://review.rdoproject.org/r/1275
16:01:36 <imcsk8> #action chandankumar to chair next meeting
16:01:57 <imcsk8> #topic open floor
16:02:10 <imcsk8> is there anything else? or should we finish?
16:03:05 <imcsk8> ok, closing meeting
16:03:08 <imcsk8> 1
16:03:10 <imcsk8> 2
16:03:11 <imcsk8> 3
16:03:15 <jruzicka> C-C-COMBO BREAKER
16:03:18 <chandankumar> 4
16:03:20 <imcsk8> #endmeeting