--- Log opened Tue May 30 00:00:15 2017
00:32 -!- nsoffer: has quit [Ping timeout: 260 seconds]
00:32 -!- nsoffer (Nir Soffer): has joined #vdsm
00:45 -!- fsimonce: has quit [Quit: Coyote finally caught me]
02:37 -!- dougsland: has quit [Ping timeout: 260 seconds]
03:01 -!- nsoffer: has quit [Ping timeout: 240 seconds]
04:50 -!- Shu6h3ndu (Shubhendu): has joined #vdsm
07:00 -!- ndarshan (Darshan n): has joined #vdsm
07:04 -!- ishaby: has quit [Ping timeout: 240 seconds]
07:07 -!- humblec: has quit [Ping timeout: 255 seconds]
07:54 -!- fromani (Francesco Romani): has joined #vdsm
08:04 -!- fabiand (Fabian Deutsch): has joined #vdsm
08:11 -!- dholler (Dominik Holler): has joined #vdsm
08:11 -!- Humble (hchiramm): has joined #vdsm
08:27 -!- mskrivanek_away is now known as mskrivanek
08:31 -!- dchaplyg: has quit [Ping timeout: 246 seconds]
08:36 -!- mzamazal (Milan Zamazal): has joined #vdsm
08:37 -!- dholler: has quit [Ping timeout: 260 seconds]
08:43 -!- dchaplyg (Denis Chaplygin): has joined #vdsm
08:52 -!- mmirecki (Marcin Mirecki): has joined #vdsm
08:55 -!- dholler (Dominik Holler): has joined #vdsm
08:57 -!- tiraboschi_ (purple): has joined #vdsm
09:01 -!- dholler: has quit [Ping timeout: 272 seconds]
09:02 -!- dholler (Dominik Holler): has joined #vdsm
09:09 -!- mmirecki: has quit [Ping timeout: 240 seconds]
09:20 -!- ishaby: has quit [Ping timeout: 240 seconds]
09:24 -!- mmirecki (Marcin Mirecki): has joined #vdsm
09:42 -!- dchaplyg: has quit [Ping timeout: 268 seconds]
09:54 -!- fsimonce (Federico): has joined #vdsm
09:54 -!- dchaplyg (Denis Chaplygin): has joined #vdsm
10:13 -!- pkliczew (Piotr Kliczewski): has joined #vdsm
11:27 -!- Netsplit *.net <-> *.split quits: phoracek
11:31 -!- mode/#vdsm: by ChanServ
11:31 -!- msivak: has quit [Ping timeout: 240 seconds]
11:32 -!- msivak (Martin Sivak): has joined #vdsm
11:44 -!- phoracek (phoracek): has joined #vdsm
12:14 -!- msivak: has quit [Changing host]
12:14 -!- msivak (Martin Sivak): has joined #vdsm
12:14 -!- phoracek: has quit [Changing host]
12:14 -!- phoracek (phoracek): has joined #vdsm
12:25 -!- rmatinata: has quit [Quit: This computer has gone to sleep]
13:15 -!- rmatinata (Ricardo Marin Matinata): has joined #vdsm
13:25 -!- ybronhei (purple): has joined #vdsm
13:54 -!- dholler: has quit [Ping timeout: 240 seconds]
13:56 -!- dougsland (douglas): has joined #vdsm
13:56 -!- dougsland: has quit [Changing host]
13:56 -!- dougsland (douglas): has joined #vdsm
14:03 -!- mmirecki is now known as mmirecki_lunch
14:07 -!- dholler (Dominik Holler): has joined #vdsm
14:14 -!- rmohr (Roman Mohr): has joined #vdsm
14:14 -!- rmohr: has quit [Client Quit]
14:32 -!- rmohr (Roman Mohr): has joined #vdsm
14:38 -!- dougsland: has quit [Remote host closed the connection]
14:44 -!- mmirecki_lunch is now known as mmirecki
14:45 -!- rmohr is now known as rmohr_afk
15:07 -!- edwardh: has quit [Ping timeout: 240 seconds]
15:27 -!- edwardh: has quit [Ping timeout: 260 seconds]
15:38 -!- danken: has quit [Quit: Leaving.]
15:44 -!- ishaby: has quit [Ping timeout: 240 seconds]
15:46 -!- edwardh: has quit [Ping timeout: 260 seconds]
15:53 -!- ndarshan: has quit [Quit: Leaving]
16:05 -!- edwardh: has quit [Ping timeout: 245 seconds]
16:17 -!- mmirecki: has quit [Ping timeout: 245 seconds]
16:24 -!- edwardh: has quit [Ping timeout: 240 seconds]
16:34 -!- alitke (Adam Litke): has joined #vdsm
16:34 -!- nsoffer (Nir Soffer): has joined #vdsm
16:37 -!- Shu6h3ndu: has quit [Ping timeout: 245 seconds]
16:40 -!- Humble: has quit [Ping timeout: 268 seconds]
16:46 -!- dougsland (douglas): has joined #vdsm
16:54 -!- ishaby: has quit [Ping timeout: 255 seconds]
17:07 -!- mskrivanek is now known as mskrivanek_away
17:25 -!- Humble (hchiramm): has joined #vdsm
17:43 -!- mskrivanek_away is now known as mskrivanek
18:06 -!- pkliczew: has quit [Ping timeout: 255 seconds]
18:21 -!- mzamazal: has quit [Remote host closed the connection]
18:36 -!- ybronhei: has quit [Quit: Leaving.]
18:51 -!- mmirecki (Marcin Mirecki): has joined #vdsm
18:57 -!- mmirecki: has quit [Ping timeout: 246 seconds]
19:12 -!- mskrivanek is now known as mskrivanek_away
19:14 -!- mskrivanek_away is now known as mskrivanek
19:21 -!- mmirecki (Marcin Mirecki): has joined #vdsm
19:35 -!- mskrivanek is now known as mskrivanek_away
19:49 -!- mmirecki: has quit [Ping timeout: 260 seconds]
20:05 -!- mode/#vdsm: by ChanServ
20:25 -!- danken: has quit [Quit: Leaving.]
20:36 -!- rmatinata: has quit [Quit: This computer has gone to sleep]
20:56 < nsoffer> alitke, ping
20:57 < alitke> nsoffer, ponf
20:57 < alitke> pong
20:57 < nsoffer> hi
20:57 < nsoffer> about https://gerrit.ovirt.org/#/c/75008/11/lib/vdsm/storage/operation.py@81
20:57 < nsoffer> I don't think anyone will understand convey()
20:58 < nsoffer> watch() can work, it is like the old watchCmd
20:58 < alitke> perfect.
20:59 < nsoffer> but you can also say that watch is something you do after you start the command yourself
21:00 < nsoffer> in one of the old versions, run_iter was suggested
21:01 < alitke> i don't like that one
21:01 < nsoffer> same here, but it describes what this method does
21:01 < alitke> watch is nice because it suggests that we have an opportunity to process output as it arrives.
21:02 < alitke> which is the key differentiation from run()
21:02 < nsoffer> ok, watch() is nicer than irun()
21:02 < alitke> great.
21:03 < nsoffer> so about the next patch - do you have suggestions how to split it?
21:03 < nsoffer> https://gerrit.ovirt.org/#/c/75009/
21:07 < nsoffer> alitke, the changes in the callers cannot be split
21:08 < alitke> well maybe it cannot be split then.
21:08 < nsoffer> and most of the change is replacing a lot of duplicate code with using the operation
21:08 < alitke> unless you want to carry both implementations in the class and convert a few call sites at a time.
21:08 < alitke> but that seems like overkill
21:09 < alitke> It's a large patch but with a consistent and repetitive logical change.
21:11 < nsoffer> maybe I can split the tests, I can introduce _update_progress() in the old code and change the tests to use it
21:11 < alitke> yeah, that could be a good idea.
21:12 < nsoffer> and maybe split the change in error - raise cmdutils.Error instead of qemuimg.QImgError
21:14 -!- rmatinata (Ricardo Marin Matinata): has joined #vdsm
21:14 < alitke> there ya go. Then that should leave all of the call sites to their own patch which would be easy to read.
21:25 < nsoffer> the last patch will be both callers and impl, you cannot separate that
21:27 < nsoffer> alitke, regarding https://gerrit.ovirt.org/#/c/75030/, I want to test this better
21:28 < alitke> ok
21:28 < nsoffer> alitke, last time I tested the scsi device (/dev/sdxxx) directly instead of /dev/mapper/yyyyy
21:29 < nsoffer> alitke, do you think https://gerrit.ovirt.org/#/c/69782/ is safe to backport to 4.1?
21:29 < nsoffer> alitke, we have a bug about bad logging, and this adds good logging
21:30 < nsoffer> alitke, I can prepare another patch adding better logging to old code
21:30 < nsoffer> alitke, in 3 different places
21:30 < nsoffer> alitke, or backport this as is
21:31 < alitke> I would not backport a completely new module just to get better logging.
21:31 < nsoffer> alitke, new module is easy to backport, the issue is the big change in the old code
21:31 < nsoffer> new module is the safest change you can have
21:32 < alitke> my point is it's a lot of new code to add for just some logging.
21:32 < nsoffer> I agree, but also easier to maintain
21:32 < nsoffer> same code on 4.1 and 4.2
21:33 < nsoffer> if we have a bug, easier to fix
21:41 < alitke> The main purpose of that patch was not logging so to me it is out of scope.
21:45 < nsoffer> alitke, so add 4.1 only patch adding logging for the 3 places doing zero volume?
21:45 < alitke> Is there a bug against the logging in 4.1?
21:45 < nsoffer> alitke, sure, on this patch
21:46 < nsoffer> alitke, support uses this log to know if a disk was zeroed
21:46 < nsoffer> alitke, we don't keep this state on the volume metadata
21:47 < nsoffer> the right solution is probably to mark a disk as zeroed after it was zeroed
21:50 < alitke> Well, if we have a mandate to fix this in 4.1 then I'll let you decide if you want to backport the whole thing.
21:52 < alitke> nsoffer, can you look at this:
21:52 < alitke> https://paste.fedoraproject.org/paste/X488b~OtA1uI3gxnRwOQSV5M1UNdIGYhyRLivL9gydE=
21:53 < alitke> We have a call to getAllTasksStatuses showing a task
21:53 < alitke> then engine calls clearTask and we report the task doesn't exist.
21:53 < alitke> Not sure how this can be.
21:53 < nsoffer> is this the bug from Gordon?
21:53 < alitke> yes
21:55 < nsoffer> don't we have 2 calls to clear the same task?
21:55 < nsoffer> jsonrpc.Executor/0::DEBUG::2017-04-04 20:46:39,254::__init__::532::jsonrpc.JsonRpcServer::(_handle_request) Calling 'Task.clear' in bridge with {u'taskID': u'e544ea13-63ac-4f9b-952b-7398e3427b6e'}
21:55 < nsoffer> jsonrpc.Executor/7::DEBUG::2017-04-04 20:46:40,288::__init__::532::jsonrpc.JsonRpcServer::(_handle_request) Calling 'Task.clear' in bridge with {u'taskID': u'e544ea13-63ac-4f9b-952b-7398e3427b6e'}
21:56 < nsoffer> engine is calling twice?
21:57 < nsoffer> alitke, ^^^
21:58 < alitke> hmm...
21:59 < alitke> indeed
22:00 < alitke> so looks like a race in engine.
22:00 < nsoffer> yea
22:01 < alitke> thanks. Don't know how I missed that.
22:01 < alitke> Reading those logs gives me a headache sometimes.
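The duplicate 'Task.clear' call that nsoffer spotted by eye could also be found mechanically. The following is a minimal sketch, not part of vdsm: the function name find_repeated_calls and the regular expression are hypothetical, and the pattern is inferred only from the two log lines quoted above, so it may not match other vdsm log formats.

```python
import re
from collections import defaultdict

# Pattern inferred from the two jsonrpc lines quoted in the log (an assumption,
# not an official vdsm log grammar): thread::DEBUG::timestamp::...Calling 'Verb'
CALL_RE = re.compile(
    r"(?P<thread>[^:\s]+)::DEBUG::(?P<time>[\d-]+ [\d:,]+)::.*"
    r"Calling '(?P<verb>[\w.]+)' in bridge with .*"
    r"u?'taskID': u?'(?P<task>[0-9a-f-]+)'"
)


def find_repeated_calls(lines, verb="Task.clear"):
    """Return {taskID: [(timestamp, thread), ...]} for tasks hit more than once."""
    calls = defaultdict(list)
    for line in lines:
        match = CALL_RE.search(line)
        if match and match.group("verb") == verb:
            calls[match.group("task")].append(
                (match.group("time"), match.group("thread")))
    # Keep only the task IDs that received the same verb two or more times.
    return {task: seen for task, seen in calls.items() if len(seen) > 1}
```

Feeding it the lines of a vdsm log file, for example find_repeated_calls(open(path)), would list each task ID that was cleared more than once together with the executor threads involved, which is the evidence used above to conclude that engine called twice.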
22:03 < nsoffer> alitke, I think I fixed the flaky test: https://gerrit.ovirt.org/#/c/77559/
22:03 < nsoffer> alitke, ran about 9000 iterations on travis and jenkins, no failure yet
22:04 < nsoffer> I cannot explain why stat(path).st_blocks does not return the right value
22:04 < nsoffer> but qemu-img check seems to be reliable
22:04 < nsoffer> maybe the kernel is trying to be smart about updating file metadata?
22:05 < nsoffer> while qemu is using the actual data in the file
22:06 < alitke> ahh, right. If the writes are still in the page cache pending writeback
22:07 < alitke> you could probably force that value to update by using fsync
22:08 -!- tiraboschi_: has left #vdsm ["QUIT :Leaving."]
22:09 < nsoffer> alitke, so the metadata changes are pending?
22:10 < alitke> the pattern writes are not on disk
22:10 < alitke> the test works because the data is read through the page cache
22:10 < alitke> but the st_blocks value is part of the fs and the writes haven't flushed there yet.
22:11 < nsoffer> the interesting part is that the test succeeds 99.99% of the time
22:11 < alitke> yep, that's because the kernel is writing back buffers to disk
22:11 < nsoffer> so qemu is probably using fsync or fdatasync
22:11 < alitke> and on powerful systems this probably happens quickly enough
22:11 < nsoffer> otherwise the write can wait for many seconds
22:12 < alitke> but on io loaded systems it could take some more time.
22:12 < alitke> qemu-img check probably runs an internal fsync on the file.
22:13 < nsoffer> anyway using qemu-img check seems more correct, we don't care about actual storage, only about the data in the image
22:13 < alitke> right.
22:16 < nsoffer> alitke, when you have time, please look at https://gerrit.ovirt.org/68822
22:16 < nsoffer> this is an interesting way to get better performance in hyper-converged setup
22:17 < nsoffer> alitke, so you can have both shared storage and local in the same setup
22:17 < alitke> ok. tomorrow.
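alitke's fsync suggestion for the st_blocks discussion above can be sketched as follows. This is only an illustration of the idea, not the code in the gerrit patch: write_pattern and allocated_bytes are hypothetical helper names, the 512-byte unit for st_blocks is the Linux convention, and the log does not confirm that fsync actually makes the flaky value stable.

```python
import os


def write_pattern(path, size, sync=True):
    """Write `size` bytes of a test pattern, optionally forcing them to storage."""
    with open(path, "wb") as f:
        f.write(b"\x55" * size)
        if sync:
            # Flush Python's buffer, then ask the kernel to write the data out,
            # so filesystem metadata is not left waiting for writeback.
            f.flush()
            os.fsync(f.fileno())


def allocated_bytes(path):
    """Bytes allocated to `path` as reported by stat (st_blocks is in 512-byte units on Linux)."""
    return os.stat(path).st_blocks * 512
```

A test could call write_pattern(path, size) and then compare allocated_bytes(path) with what it expects. As nsoffer notes, checking the image with qemu-img check sidesteps the question entirely, because it looks at the data in the image and not at how the filesystem accounts for it.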
22:18 < nsoffer> sure, see you later this week
22:23 < alitke> See you. Enjoy the holidays!
22:49 -!- dougsland: has quit [Ping timeout: 268 seconds]
23:12 -!- dholler: has quit [Ping timeout: 260 seconds]
23:24 -!- nsoffer: has quit [Quit: Segmentation fault (core dumped)]
--- Log closed Wed May 31 00:00:16 2017