--- Log opened Thu Feb 25 00:00:07 2016
00:03 -!- fsimonce: has quit [Quit: Coyote finally caught me]
00:48 -!- rmatinata (Ricardo Marin Matinata): has joined #vdsm
01:03 -!- PaulMaidment (Paul Maidment): has joined #vdsm
01:03 -!- PaulMaidment: has quit [Client Quit]
03:08 -!- ybronhei: has quit [Ping timeout: 240 seconds]
03:57 -!- hchiramm_ (hchiramm): has joined #vdsm
04:28 -!- shubhendu (Shubhendu Tripathi): has joined #vdsm
06:01 -!- ndarshan (Darshan n): has joined #vdsm
06:36 -!- ishaby (Idan Shaby): has joined #vdsm
06:38 -!- shubhendu: has quit [Ping timeout: 240 seconds]
06:50 -!- shubhendu (Shubhendu Tripathi): has joined #vdsm
06:56 -!- shubhendu: has quit [Ping timeout: 252 seconds]
06:58 -!- gpadgett: has quit [Ping timeout: 255 seconds]
07:10 -!- shubhendu (Shubhendu Tripathi): has joined #vdsm
07:16 -!- shubhendu: has quit [Ping timeout: 240 seconds]
07:18 -!- tim (Tim): has joined #vdsm
07:20 -!- edwardh (purple): has joined #vdsm
07:29 -!- shubhendu (Shubhendu Tripathi): has joined #vdsm
07:35 -!- fabiand (Fabian Deutsch): has joined #vdsm
07:39 -!- nsoffer (Nir Soffer): has joined #vdsm
07:48 -!- hchiramm_: has quit [Read error: Connection reset by peer]
07:50 -!- shubhendu: has quit [Remote host closed the connection]
07:52 -!- shubhendu (Shubhendu Tripathi): has joined #vdsm
08:18 -!- sbonazzo (purple): has joined #vdsm
08:19 -!- apuimedo: has quit [Ping timeout: 240 seconds]
08:19 -!- pkliczew (Piotr Kliczewski): has joined #vdsm
08:26 -!- derez_ (Daniel Erez): has joined #vdsm
08:29 -!- mmirecki (Marcin Mirecki): has joined #vdsm
08:43 -!- mzamazal (Milan Zamazal): has joined #vdsm
08:45 -!- fromani (Francesco Romani): has joined #vdsm
08:46 -!- hchiramm (hchiramm): has joined #vdsm
08:59 -!- rmohr (Roman Mohr): has joined #vdsm
09:01 -!- vered (Vered Volansky): has joined #vdsm
09:11 -!- nsoffer: has quit [Quit: Segmentation fault (core dumped)]
09:28 -!- fabiand: has quit [Quit: Verlassend]
09:30 -!- fabiand (Fabian Deutsch): has joined #vdsm
09:33 -!- fsimonce (Federico): has joined #vdsm
09:39 -!- ybronhei (purple): has joined #vdsm
09:47 -!- hchiramm: has quit [Remote host closed the connection]
10:00 -!- mmirecki: has quit [Ping timeout: 240 seconds]
10:02 -!- edwardh: has quit [Ping timeout: 248 seconds]
10:06 -!- jbrooks: has quit [Ping timeout: 240 seconds]
10:09 -!- acanan (Aharon Canan): has joined #vdsm
10:12 -!- mmirecki (Marcin Mirecki): has joined #vdsm
10:16 -!- hchiramm (hchiramm): has joined #vdsm
10:24 -!- edwardh (purple): has joined #vdsm
10:27 -!- jbrooks (Jason Brooks): has joined #vdsm
10:43 -!- fabiand: has quit [Quit: Verlassend]
11:03 -!- mzamazal: has quit [Remote host closed the connection]
11:06 -!- mzamazal (Milan Zamazal): has joined #vdsm
11:08 -!- mzamazal: has quit [Remote host closed the connection]
11:08 -!- edwardh: has quit [Ping timeout: 244 seconds]
11:12 -!- mzamazal (Milan Zamazal): has joined #vdsm
11:25 -!- edwardh (purple): has joined #vdsm
11:30 -!- osvoboda (osvoboda): has joined #vdsm
11:36 < osvoboda> ybronhei, fromani: Hi guys, may I ask you to merge https://gerrit.ovirt.org/#/c/53882/ and then https://gerrit.ovirt.org/#/c/53942/ to 3.6?
11:38 < fromani> osvoboda: done
11:38 < osvoboda> fromani: Thanks Francesco! And how are you?
11:39 < fromani> ho Ondra, I was in Brno a couple weeks ago (around devconf WE), seems like we didn't catch each other
11:39 < fromani> hi*
11:39 < fromani> osvoboda: ^^^^
11:40 < osvoboda> fromani: I was still in England. Did you like devconf, or Brno more? :-)
11:40 < fromani> osvoboda: both :)
11:41 < osvoboda> fromani: Glad to hear! Be sure to come back soon!
11:41 < fromani> osvoboda: sure, I'd love to :)
11:41 < osvoboda> fromani: It was nice talking to you last time :-)
11:42 < fromani> osvoboda: same here, I also got nice weather (no snow, no excessive cold)
11:42 < fromani> osvoboda: so definitely a very nice visit
11:43 < osvoboda> Snow-covered Brno looks better but if only it were warm at the same time ;-)
11:44 < fromani> _and_ if you don't have to catch a bus/train to get back in Prague :)
11:47 < osvoboda> fromani: Haha, no great escape this year :-D
11:52 -!- mode/#vdsm: by ChanServ
11:52 -!- danken (purple): has joined #vdsm
12:27 -!- mmirecki is now known as mmirecki-lunch
12:33 < edwardh> ybronhei, fromani: Could you please merge https://gerrit.ovirt.org/#/q/status:open+project:vdsm+branch:ovirt-3.6+topic:ifcfg-persistence-36 ?
12:33 < fromani> edwardh: on it
12:33 < edwardh> fromani: thanks
12:34 -!- pkliczew is now known as pkliczew|lunch
12:44 -!- fabiand (Fabian Deutsch): has joined #vdsm
12:51 -!- edwardh: has quit [Remote host closed the connection]
12:56 -!- tim: has quit [Ping timeout: 255 seconds]
13:06 -!- rmohr: has quit [Ping timeout: 248 seconds]
13:14 -!- osvoboda: has quit [Ping timeout: 255 seconds]
13:34 -!- acanan: has quit [Ping timeout: 276 seconds]
13:38 -!- edwardh (purple): has joined #vdsm
13:45 -!- mmirecki-lunch is now known as mmirecki
13:46 -!- acanan (Aharon Canan): has joined #vdsm
13:46 -!- pkliczew|lunch is now known as pkliczew
14:03 -!- apuimedo (Antoni Segura Puimedon): has joined #vdsm
14:07 -!- danken: has quit [Ping timeout: 255 seconds]
14:13 -!- phoracek (phoracek): has joined #vdsm
14:16 -!- acanan: has quit [Ping timeout: 244 seconds]
14:23 -!- acanan (Aharon Canan): has joined #vdsm
14:24 -!- mode/#vdsm: by ChanServ
14:24 -!- danken (purple): has joined #vdsm
14:25 -!- apahim_ (Amador Pahim): has joined #vdsm
14:26 -!- apahim: has quit [Ping timeout: 240 seconds]
14:27 -!- bazulay: has left #vdsm []
14:29 -!- acanan_ (Aharon Canan): has joined #vdsm
14:30 -!- bazulay: has left #vdsm []
14:33 -!- acanan: has quit [Ping timeout: 250 seconds]
14:34 -!- nsoffer (Nir Soffer): has joined #vdsm
14:52 -!- amarchuk: has quit [Quit: Leaving]
14:53 -!- amarchuk (Anton Marchukov): has joined #vdsm
15:03 -!- danken1 (purple): has joined #vdsm
15:03 -!- mode/#vdsm: by ChanServ
15:03 -!- danken: has quit [Read error: Connection reset by peer]
15:08 -!- danken1: has quit [Ping timeout: 240 seconds]
15:27 -!- tiraboschi_ (purple): has joined #vdsm
15:27 < tiraboschi_> nsoffer: Hi Nir
15:27 < tiraboschi_> nsoffer: can I disturb you a few minutes?
15:28 < nsoffer> tiraboschi_, few
15:28 < tiraboschi_> ok, it's still for the upgrade from he35 -> he36
15:28 < tiraboschi_> we saw an issue on our QE env that I'm not able to reproduce
15:29 < tiraboschi_> from 3.5 we have
15:29 < tiraboschi_> spUUID=7c304cac-abb3-400b-a77c-ef1910d7cb53
15:29 < tiraboschi_> sdUUID=3f1dd469-3bf3-4c20-af88-a0e0dc551072
15:29 < tiraboschi_> and the SD is:
15:29 < tiraboschi_> # vdsClient -s 0 getStorageDomainInfo 3f1dd469-3bf3-4c20-af88-a0e0dc551072
15:29 < tiraboschi_> uuid = 3f1dd469-3bf3-4c20-af88-a0e0dc551072
15:29 < tiraboschi_> vguuid = 2DOIFE-maeZ-yIGH-LE46-vh7C-jflJ-m8bFhZ
15:29 < tiraboschi_> state = OK
15:29 < tiraboschi_> version = 3
15:29 < tiraboschi_> role = Regular
15:29 < tiraboschi_> type = ISCSI
15:29 < tiraboschi_> class = Data
15:29 < tiraboschi_> pool =:
15:29 < tiraboschi_> name = hosted_storage
15:29 < tiraboschi_> he issue is that connectStoragePool fails with 'Wrong master or ...'
15:30 < tiraboschi_> nsoffer: the issue, sorry
15:31 < nsoffer> tiraboschi_, when do you connect storage pool?
15:31 -!- hchiramm: has quit [Ping timeout: 240 seconds]
15:31 < tiraboschi_> nsoffer: in the upgrade procedure to create the new volume for the configuration
15:32 < nsoffer> before import?
15:32 < tiraboschi_> the command seems correct to me
15:32 < tiraboschi_> Thread-3640::INFO::2016-02-25 16:30:18,257::logUtils::48::dispatcher::(wrapper) Run and protect: connectStoragePool(spUUID='7c304cac-abb3-400b-a77c-ef1910d7cb53', hostID=1, msdUUID='3f1dd469-3bf3-4c20-af88-a0e0dc551072', masterVersion=1, domainsMap={'3f1dd469-3bf3-4c20-af88-a0e0dc551072': 'active'}, options=None)
15:32 < tiraboschi_> nsoffer: before import
15:32 < nsoffer> what is the master domain?
15:32 < nsoffer> not the hosted engine one, right?
15:32 < tiraboschi_> nsoffer: no, the strange thing is here
15:33 < tiraboschi_> nsoffer: that bootstrap storage pool contains just that SD
15:33 < tiraboschi_> the error I got is
15:33 < tiraboschi_> Thread-3640::ERROR::2016-02-25 16:30:18,568::sp::1446::Storage.StoragePool::(setMasterDomain) Requested master domain 3f1dd469-3bf3-4c20-af88-a0e0dc551072 is not a master domain at all
15:33 < tiraboschi_> nsoffer: but no other SD could be the master
15:33 < nsoffer> "is not a master domain at all"?
15:34 < nsoffer> is this log from vdsm?
15:34 < tiraboschi_> nsoffer: yes, it is
15:34 < nsoffer> role = Regular
15:34 < tiraboschi_> on the pool metadata I see:
15:34 < tiraboschi_> Thread-3640::DEBUG::2016-02-25 16:30:18,567::persistentDict::234::Storage.PersistentDict::(refresh) read lines (VGTagMetadataRW)=['CLASS=Data', 'DESCRIPTION=hosted_storage', 'IOOPTIMEOUTSEC=10', 'LEASERETRIES=3', 'LEASETIMESEC=60', 'LOCKPOLICY=ON', 'LOCKRENEWALINTERVALSEC=5', 'LOGBLKSIZE=512', 'MASTER_VERSION=1', 'PHYBLKSIZE=4096', 'POOL_DESCRIPTION=hosted_datacenter', 'POOL_DOMAINS=3f1dd469-3bf3-4c20-af88-a0e0dc551072
15:34 < nsoffer> maybe this is the issue?
15:35 -!- ndarshan: has quit [Quit: Leaving]
15:35 < tiraboschi_> nsoffer: it could be, but if that SD is regular, who is the master?
15:35 < nsoffer> tiraboschi_, do you know how to reproduce this without hosted engine?
15:36 < tiraboschi_> nsoffer: no, on my side it doesn't reproduce also without hosted-engine
15:36 < tiraboschi_> nsoffer: sorry also on hosted-engine
15:37 < nsoffer> tiraboschi_, maybe the domain on this setup was not accessible when vdsm tried to update the role
15:37 < nsoffer> tiraboschi_, for example, during negative flows tests
15:37 < tiraboschi_> nsoffer: how can I tell that?
15:37 < tiraboschi_> nsoffer: would you like to access that host?
15:37 < nsoffer> tiraboschi_, I think on engine side, engine will do reconstructMaster flow in this case
15:38 < nsoffer> tiraboschi_, try to talk with engine storage guy about this case, and what is the way to recover this pool
15:38 < tiraboschi_> nsoffer: should I try to manually recover it?
15:38 < nsoffer> tiraboschi_, not manually, using vdsm apis
15:38 < tiraboschi_> nsoffer: yes, of course... :-)
15:39 < nsoffer> tiraboschi_, talk with Liron or Daniel about this
15:39 < tiraboschi_> nsoffer: ok, thanks
15:39 < nsoffer> tiraboschi_, I guess that on normal setup, you always get
15:39 < nsoffer> role = Master
15:39 < tiraboschi_> nsoffer: exactly...
15:40 < nsoffer> so you can check the domain role, and if it is not master, do the recovery
15:40 < tiraboschi_> ok, let me try
15:40 < nsoffer> tiraboschi_, probably you need to call reconstructMaster
15:41 < nsoffer> tiraboschi_, to see how engine calls it, you can set up two domains
15:41 < nsoffer> then you put the master domain to maintenance
15:41 < nsoffer> tiraboschi_, engine will reconstruct master on the other domain
15:42 < tiraboschi_> nsoffer: afaik engine from 3.5 shouldn't see the hosted-engine SD and neither the hosted-engine SP
15:42 < nsoffer> tiraboschi_, all this will be gone in 4.x - there will be no master domain
15:43 < tiraboschi_> nsoffer: I hope so :-)
15:43 < nsoffer> tiraboschi_, I'm talking about another setup with two domains, no hosted engine
15:43 < nsoffer> tiraboschi_, just to see how engine handles this
15:44 -!- hchiramm (hchiramm): has joined #vdsm
15:52 -!- tiraboschi_: has quit [Remote host closed the connection]
15:53 -!- tiraboschi_ (purple): has joined #vdsm
15:53 -!- rmohr (Roman Mohr): has joined #vdsm
15:54 -!- mode/#vdsm: by ChanServ
15:54 -!- danken (purple): has joined #vdsm
15:58 < tiraboschi_> nsoffer: probably the issue is here: vdsm.log:Thread-112::INFO::2016-02-25 16:06:06,112::logUtils::48::dispatcher::(wrapper) Run and protect: reconstructMaster(spUUID='7c304cac-abb3-400b-a77c-ef1910d7cb53', poolName='hosted_engine', masterDom='d798e49f-7108-4db7-b740-59e438c27e19', domDict={'3f1dd469-3bf3-4c20-af88-a0e0dc551072': 'active', 'd798e49f-7108-4db7-b740-59e438c27e19': 'active'}, masterVersion=1, lockPolicy=N
15:59 < tiraboschi_> nsoffer: something sent a reconstructMaster command changing the masterSD...
15:59 < tiraboschi_> nsoffer: now I've to check who is the killer...
15:59 < nsoffer> tiraboschi_, someone was accessing the bootstrap sd?
16:00 < tiraboschi_> nsoffer: maybe it's a race condition with the other HE host
16:00 < nsoffer> tiraboschi_, what is d798e49f-7108-4db7-b740-59e438c27e19?
16:00 -!- rmatinata: has quit [Ping timeout: 240 seconds]
16:00 < tiraboschi_> nsoffer: I don't see it on this host
16:00 < nsoffer> tiraboschi_, the bootstrap pool has more than one sd?
16:01 < tiraboschi_> nsoffer: it shouldn't
16:02 < nsoffer> tiraboschi_, other he host with running engine?
16:02 < tiraboschi_> nsoffer: exactly
16:02 < nsoffer> tiraboschi_, but engine does not know anything about this pool or domain, right?
16:03 < tiraboschi_> nsoffer: I understood
16:03 < tiraboschi_> MainThread::INFO::2016-02-25 16:06:32,964::upgrade::298::ovirt_hosted_engine_ha.lib.upgrade.StorageServer::(_connectStoragePool) Connecting storage pool - master 'd798e49f-7108-4db7-b740-59e438c27e19' - dom_dict '{'3f1dd469-3bf3-4c20-af88-a0e0dc551072': 'active', 'd798e49f-7108-4db7-b740-59e438c27e19': 'active'}'
16:03 < tiraboschi_> MainThread::INFO::2016-02-25 16:06:33,522::upgrade::668::ovirt_hosted_engine_ha.lib.upgrade.StorageServer::(_spmStart) spmStart
16:03 < tiraboschi_> MainThread::INFO::2016-02-25 16:06:33,522::upgrade::658::ovirt_hosted_engine_ha.lib.upgrade.StorageServer::(_isSPM) isSPM
16:03 < tiraboschi_> MainThread::INFO::2016-02-25 16:06:33,567::upgrade::658::ovirt_hosted_engine_ha.lib.upgrade.StorageServer::(_isSPM) isSPM
16:03 < tiraboschi_> MainThread::INFO::2016-02-25 16:06:35,588::upgrade::658::ovirt_hosted_engine_ha.lib.upgrade.StorageServer::(_isSPM) isSPM
16:03 < tiraboschi_> MainThread::INFO::2016-02-25 16:06:37,654::upgrade::658::ovirt_hosted_engine_ha.lib.upgrade.StorageServer::(_isSPM) isSPM
16:03 < tiraboschi_> MainThread::INFO::2016-02-25 16:06:39,856::upgrade::658::ovirt_hosted_engine_ha.lib.upgrade.StorageServer::(_isSPM) isSPM
16:03 < tiraboschi_> MainThread::INFO::2016-02-25 16:06:41,915::upgrade::658::ovirt_hosted_engine_ha.lib.upgrade.StorageServer::(_isSPM) isSPM
16:03 < tiraboschi_> MainThread::INFO::2016-02-25 16:06:44,133::upgrade::658::ovirt_hosted_engine_ha.lib.upgrade.StorageServer::(_isSPM) isSPM
16:03 < tiraboschi_> MainThread::INFO::2016-02-25 16:06:46,151::upgrade::658::ovirt_hosted_engine_ha.lib.upgrade.StorageServer::(_isSPM) isSPM
16:03 < tiraboschi_> MainThread::INFO::2016-02-25 16:06:48,363::upgrade::658::ovirt_hosted_engine_ha.lib.upgrade.StorageServer::(_isSPM) isSPM
16:03 < tiraboschi_> MainThread::INFO::2016-02-25 16:06:50,494::upgrade::658::ovirt_hosted_engine_ha.lib.upgrade.StorageServer::(_isSPM) isSPM
16:04 < tiraboschi_> nsoffer: the upgrade procedure started to create the fake SD to detach our SD
16:05 < tiraboschi_> nsoffer: and the user rebooted the host exactly in the middle
16:05 < tiraboschi_> nsoffer: and now that SP is dirty
16:05 < nsoffer> tiraboschi_, so you have to clean it
16:05 < tiraboschi_> nsoffer: since the master SD was a SD on a loopback device that now doesn't exist
16:05 < tiraboschi_> nsoffer: exactly
16:06 -!- mmirecki: has quit [Ping timeout: 244 seconds]
16:06 < tiraboschi_> nsoffer: thanks again
16:10 -!- shubhendu: has quit [Ping timeout: 276 seconds]
16:12 -!- rmohr: has quit [Quit: rmohr]
16:26 -!- fsimonce: has quit [Ping timeout: 240 seconds]
16:27 -!- apahim_: has quit [Ping timeout: 240 seconds]
16:27 -!- mzamazal: has quit [Ping timeout: 240 seconds]
16:27 -!- derez_: has quit [Quit: Leaving]
16:27 -!- derez_: has joined #vdsm
16:27 -!- derez_ is Daniel Erez
16:28 -!- phoracek: has quit [Quit: WeeChat 1.4]
16:29 -!- fsimonce (Federico): has joined #vdsm
16:32 -!- mmirecki (Marcin Mirecki): has joined #vdsm
16:39 -!- apahim_ (Amador Pahim): has joined #vdsm
16:42 -!- danken: has quit [Quit: Leaving.]
16:53 -!- derez_: has quit [Quit: Leaving]
17:03 -!- sbonazzo: has quit [Quit: Leaving.]
17:07 -!- nsoffer: has quit [Ping timeout: 248 seconds]
17:10 -!- gpadgett (Greg Padgett): has joined #vdsm
17:13 -!- pkliczew: has quit [Ping timeout: 248 seconds]
17:16 -!- ybronhei: has quit [Ping timeout: 244 seconds]
17:17 -!- acanan_: has quit [Ping timeout: 244 seconds]
17:42 -!- edwardh: has quit [Ping timeout: 244 seconds]
17:43 -!- rmatinata (Ricardo Marin Matinata): has joined #vdsm
17:55 -!- shubhendu (Shubhendu Tripathi): has joined #vdsm
18:06 -!- ishaby: has quit [Ping timeout: 276 seconds]
18:08 -!- hchiramm: has quit [Ping timeout: 255 seconds]
18:18 -!- fromani: has quit [Remote host closed the connection]
18:19 -!- edwardh (purple): has joined #vdsm
18:20 -!- pkliczew (Piotr Kliczewski): has joined #vdsm
18:31 -!- edwardh: has quit [Ping timeout: 244 seconds]
19:24 -!- pkliczew: has quit [Ping timeout: 244 seconds]
19:33 -!- vered: has quit [Ping timeout: 244 seconds]
19:44 -!- osvoboda (osvoboda): has joined #vdsm
19:51 -!- edwardh (purple): has joined #vdsm
20:21 -!- hchiramm (hchiramm): has joined #vdsm
20:29 -!- pkliczew (Piotr Kliczewski): has joined #vdsm
20:31 -!- osvoboda: has quit [Ping timeout: 240 seconds]
20:48 -!- tim_ (Tim): has joined #vdsm
20:51 -!- osvoboda (osvoboda): has joined #vdsm
20:53 -!- shubhendu: has quit [Ping timeout: 244 seconds]
21:11 -!- pkliczew: has quit [Ping timeout: 250 seconds]
21:21 -!- tim_: has quit [Ping timeout: 240 seconds]
21:37 -!- osvoboda: has quit [Ping timeout: 240 seconds]
21:57 -!- rmatinata: has quit [Ping timeout: 250 seconds]
22:00 -!- osvoboda (osvoboda): has joined #vdsm
22:13 -!- mmirecki: has quit [Ping timeout: 244 seconds]
22:23 -!- apahim_: has quit [Ping timeout: 276 seconds]
22:38 -!- apahim_ (Amador Pahim): has joined #vdsm
23:13 -!- rmatinata (Ricardo Marin Matinata): has joined #vdsm
23:21 -!- amarchuk: has quit [Ping timeout: 276 seconds]
23:30 -!- osvoboda: has quit [Ping timeout: 240 seconds]
23:36 -!- edwardh: has quit [Ping timeout: 240 seconds]
23:55 -!- osvoboda (osvoboda): has joined #vdsm
--- Log closed Fri Feb 26 00:00:09 2016
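
Editor's note: the two checks discussed in the 15:40 to 15:59 exchange (look at the domain role, and look for an unexpected reconstructMaster call in vdsm.log) can be sketched in Python as below. This is only an illustration that works on text pasted from the log above; it does not call vdsm or vdsClient. The helper names `parse_domain_info`, `needs_master_recovery`, and `reconstructed_masters` are hypothetical and are not part of any vdsm API. The garbled `pool =:` line from the paste is left out of the sample.

```python
import re

# Matches the masterDom argument of a "Run and protect: reconstructMaster(...)"
# line as pasted in the log. The pattern is an assumption based on that one line.
RECONSTRUCT_RE = re.compile(r"reconstructMaster\(.*?masterDom='([0-9a-f-]+)'")


def parse_domain_info(text):
    """Parse 'key = value' lines into a dict; lines without '=' are skipped."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            info[key.strip()] = value.strip()
    return info


def needs_master_recovery(info):
    """Return True when the domain does not report role = Master."""
    return info.get("role") != "Master"


def reconstructed_masters(log_lines):
    """Return the masterDom UUID of every reconstructMaster call found."""
    found = []
    for line in log_lines:
        match = RECONSTRUCT_RE.search(line)
        if match:
            found.append(match.group(1))
    return found


# Output pasted at 15:29 (getStorageDomainInfo for the hosted-engine SD).
SAMPLE_INFO = """\
uuid = 3f1dd469-3bf3-4c20-af88-a0e0dc551072
vguuid = 2DOIFE-maeZ-yIGH-LE46-vh7C-jflJ-m8bFhZ
state = OK
version = 3
role = Regular
type = ISCSI
class = Data
name = hosted_storage
"""

# Log line pasted at 15:58, shortened after masterDom.
SAMPLE_LOG = (
    "Thread-112::INFO::2016-02-25 16:06:06,112::logUtils::48::dispatcher::"
    "(wrapper) Run and protect: reconstructMaster("
    "spUUID='7c304cac-abb3-400b-a77c-ef1910d7cb53', poolName='hosted_engine', "
    "masterDom='d798e49f-7108-4db7-b740-59e438c27e19', "
)

info = parse_domain_info(SAMPLE_INFO)
print(info["role"], needs_master_recovery(info))  # Regular True
print(reconstructed_masters([SAMPLE_LOG]))
```

With the data from this log, the role check flags the domain, and the log scan returns a masterDom that differs from the msdUUID passed to connectStoragePool, which is the mismatch tiraboschi_ found by hand.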