Commit Graph

684 Commits (879ecfa74db8a7e51c414017b39ba541bbe70886)
 

Author SHA1 Message Date
Vitaliy Filippov 879ecfa74d Fix wording 2 years ago
Vitaliy Filippov aea2d19d35 Change Telegram chat link 2 years ago
Vitaliy Filippov 04f86dc00b Fix Russian README for CMake build 2 years ago
Vitaliy Filippov 7aeb2cbac7 Capture all by value in qemu_proxy 2 years ago
Vitaliy Filippov 519f081006 Add LICENSE 2 years ago
Vitaliy Filippov e50f703e1d Add Russian version of the README 2 years ago
Vitaliy Filippov 2612d3198a Introduce image names and metadata storage in etcd
Each inode has: image name, parent inode number & pool, size and readonly flag

Snapshots are created by switching image name to a different inode number
while using the older inode as parent.
2 years ago
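The snapshot scheme described in the commit above can be sketched as follows. This is a minimal in-memory model, not Vitastor's actual code or its etcd key layout: the names `inode_meta_t`, `image_index_t`, `create_image` and `snapshot` are hypothetical, and renaming the old inode to the snapshot name while marking it read-only is an assumption about how the "readonly flag" is used.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical per-inode metadata, mirroring the fields listed in the commit:
// image name, parent inode number & pool, size and readonly flag.
struct inode_meta_t
{
    std::string name;
    uint64_t parent_pool = 0;   // 0 = no parent
    uint64_t parent_inode = 0;  // 0 = no parent
    uint64_t size = 0;
    bool readonly = false;
};

// Hypothetical in-memory stand-in for the metadata kept in etcd.
struct image_index_t
{
    uint64_t pool_id = 1;
    uint64_t next_inode = 1;
    std::map<uint64_t, inode_meta_t> inodes;
    std::map<std::string, uint64_t> by_name;

    uint64_t create_image(const std::string & name, uint64_t size)
    {
        if (by_name.count(name))
            throw std::runtime_error("image name already exists");
        uint64_t num = next_inode++;
        inode_meta_t meta;
        meta.name = name;
        meta.size = size;
        inodes[num] = meta;
        by_name[name] = num;
        return num;
    }

    // The image name is switched to a fresh inode number and the older inode
    // becomes its parent. Assumption: the older inode takes the snapshot name
    // and is marked read-only.
    uint64_t snapshot(const std::string & name, const std::string & snap_name)
    {
        if (!by_name.count(name) || by_name.count(snap_name))
            throw std::runtime_error("bad image or snapshot name");
        uint64_t old_num = by_name.at(name);
        uint64_t new_num = next_inode++;
        assert(new_num != old_num);
        inode_meta_t child;
        child.name = name;
        child.parent_pool = pool_id;
        child.parent_inode = old_num;
        child.size = inodes.at(old_num).size;
        inodes.at(old_num).name = snap_name;
        inodes.at(old_num).readonly = true;
        inodes[new_num] = child;
        by_name[snap_name] = old_num;
        by_name[name] = new_num;
        return new_num;
    }
};
```

In this model no data is copied when a snapshot is taken: new writes go to the new inode, and anything not yet written there would be looked up through the parent link.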
Vitaliy Filippov ab39ce2bbb Switch back to clean_entry_bitmap_size instead of entry_attr_size because of changed bitmap handling 2 years ago
Vitaliy Filippov d0c2e31312 Add a test for snapshots, fix bugs. Now the test passes 2 years ago
Vitaliy Filippov 9038d42327 Fix several snapshot I/O bugs 2 years ago
Vitaliy Filippov 691f066055 Actual snapshot support (untested) 2 years ago
Vitaliy Filippov ffe1cd4c79 Report inode I/O statistics, aggregate it in the monitor 2 years ago
Vitaliy Filippov 4ae1b84c67 Report inode space usage statistics to etcd, aggregate it in the monitor 2 years ago
Vitaliy Filippov c35963967f Add inode space usage statistics tracking to blockstore 2 years ago
Vitaliy Filippov 0aa2dd2890 Send bitmaps with primary-reads, actually read bitmaps for READ ops 2 years ago
Vitaliy Filippov 6bf88883ac Allocate bitmaps along with stripes to avoid memory fragmentation 2 years ago
Vitaliy Filippov 004f265393 Remove cryptic bitmap inlining from bs_op_t and osd_op_t, use bitmap in primary OSD code 2 years ago
Vitaliy Filippov 860ac24762 Add "external" bitmap support to the secondary OSD protocol 2 years ago
Vitaliy Filippov 6107a4d07b Add "external" bitmap support to blockstore 2 years ago
Vitaliy Filippov 95c29b9dc3 Add "external" bitmap support to osd_rmw 2 years ago
Vitaliy Filippov d99407dcec Check QEMU block-vitastor.so during the test 2 years ago
Vitaliy Filippov 6909807068 Allow starting the OSD just to flush the journal completely 2 years ago
Vitaliy Filippov ec90fe6ec1 Release 0.5.13
Another followup to 0.5.11
2 years ago
Vitaliy Filippov 18c72f4835 Correct reenterability fix (now verified with a test)
It's rather funny but 0.5.12 has to be re-published again
2 years ago
Vitaliy Filippov 59fbcef734 Release 0.5.12
Fix qemu driver broken in 0.5.11 :)
2 years ago
Vitaliy Filippov 40b7c21fb1 Followup to 307c1731c1 - fix mark_stable 2 years ago
Vitaliy Filippov efb3678606 Fix qemu-img broken in 0.5.11
Caused by the lack of reenterability of the main cluster_client function
2 years ago
Vitaliy Filippov 462650134e Release 0.5.11
Another bunch of fixes, including important ones. Now OSDs are stable in SSD+HDD
configurations and everything is mostly ready for the merge of master branch.

Features:

- Add min_flusher_count configuration (good for HDDs)
- Shuffle PGs for better data device utilisation
- Make OSDs benefit from the immediate_commit=small setting if it's applicable

Bug fixes:

- Rework client code to fix write ordering during operation replay
- Rework error handling code so OSDs don't crash in reaction to a crash of their peer OSDs
- Fix several block layer problems related to the journal, some of which
  were leading to double allocations of the same block during journal replay
- Fix monitors crashing during the removal of OSD keys from etcd
- Fix data fsyncs being incorrectly disabled when only disable_journal_fsync was set
- Always zero out unused part of request/reply headers
- Fix some theoretically possible read/write ordering issues
- Don't try to "recover" misplaced objects if it would make them degraded
- Fix heartbeats sometimes preventing OSDs from establishing connections
2 years ago
Vitaliy Filippov 8d87e32175 Fix msgr_op.h includes 2 years ago
Vitaliy Filippov b0b2e7df3c Fix use-after-free in keepalive_timer and rework stop_client()
The bug was reproducible when fio was temporarily stopped with SIGSTOP
during a write test and then resumed after 10 seconds. In this case
"pings" failed for all clients and the fio process crashed with a
'use-after-free' in keepalive_timer. It happened because keepalive_timer
called stop_client while holding a live iterator to the map.
2 years ago
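The failure mode in the commit above (erasing from a map while a loop still holds an iterator into it) and one common way around it can be sketched as below. This is not the actual fix from the commit, whose diff is not shown here: `messenger_sketch_t`, `client_t` and `stop_failed_clients` are hypothetical names, and collecting the keys first is just one of several safe patterns.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical client record; only the ping state matters for this sketch.
struct client_t
{
    int fd = -1;
    bool ping_failed = false;
};

struct messenger_sketch_t
{
    std::map<int, client_t> clients;

    // Erasing invalidates any iterator that points at the erased element.
    void stop_client(int fd)
    {
        clients.erase(fd);
    }

    // UNSAFE variant (shown as a comment on purpose):
    //   for (auto & p: clients)
    //       if (p.second.ping_failed)
    //           stop_client(p.first); // the loop iterator now dangles
    //
    // Safe variant: first collect the keys, then stop the clients once no
    // iterator into the map is alive any more.
    size_t stop_failed_clients()
    {
        size_t before = clients.size();
        std::vector<int> to_stop;
        for (const auto & p: clients)
        {
            if (p.second.ping_failed)
                to_stop.push_back(p.first);
        }
        for (int fd: to_stop)
            stop_client(fd);
        assert(clients.size() + to_stop.size() == before);
        return to_stop.size();
    }
};
```

The two-pass form costs a small temporary vector but stays correct even if `stop_client` later grows side effects that remove other entries from the map.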
Vitaliy Filippov 97efb9e299 Do not crash on PG re-peering events when operations are in progress 2 years ago
Vitaliy Filippov f6d705383a Fix client connection recovery bugs, add dirty_ops limit 2 years ago
Vitaliy Filippov 68567c0e1f Fix messenger possibly trying to connect to the same OSD twice 2 years ago
Vitaliy Filippov 04b00003e9 Log ping failures 2 years ago
Vitaliy Filippov 307c1731c1 Forget all dirty_entries before stable big_write or delete during initialisation
This fixes a 'double_alloc' assertion in the following case:
- big_write object #1 v1 to block #100
- big_write object #1 v2 to block #101
- big_write object #2 v1 to block #100
2 years ago
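The three-step case listed in the commit above can be replayed in a toy model to see why forgetting older entries avoids the double allocation. This is only one plausible reading of the commit message, not the blockstore's real initialisation code: `big_write_t` and `replay` are hypothetical, and the model assumes that block #100 was legitimately reused by object #2 after object #1 moved to block #101.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <vector>

// Hypothetical journal record: a big_write of one object version to a block.
struct big_write_t
{
    uint64_t object;
    uint64_t version;
    uint64_t block;
};

// Replays the journal and returns false on a double allocation, i.e. when a
// write targets a block that is still considered occupied.
// forget_older = true models "forget all dirty entries of the object before
// applying its newer stable big_write", which releases their blocks first.
bool replay(const std::vector<big_write_t> & journal, bool forget_older)
{
    std::set<uint64_t> used_blocks;
    std::map<uint64_t, std::vector<big_write_t>> entries;
    for (const auto & w: journal)
    {
        auto & obj_entries = entries[w.object];
        // Versions of one object are expected to grow monotonically
        assert(obj_entries.empty() || obj_entries.back().version < w.version);
        if (forget_older)
        {
            for (const auto & old_entry: obj_entries)
                used_blocks.erase(old_entry.block);
            obj_entries.clear();
        }
        if (!used_blocks.insert(w.block).second)
            return false; // double_alloc
        obj_entries.push_back(w);
    }
    return true;
}
```

With the journal `{1, 1, 100}, {1, 2, 101}, {2, 1, 100}`, the replay fails when older entries are kept and succeeds when they are forgotten, because block #100 is released before object #2 claims it.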
Vitaliy Filippov 75a6a556b5 Shuffle PGs for better data device utilisation 2 years ago
Vitaliy Filippov a48e2bbf18 Fix write replay ordering when immediate_commit != all
Previous implementation didn't respect write ordering and could lead
to corrupted data when restarting writes after an OSD outage

Also rework cluster_client queueing logic and add tests for it to verify the correct behaviour
2 years ago
Vitaliy Filippov 688821665a Remove stoull_full() from etcd_state_client.cpp 2 years ago
Vitaliy Filippov 3e162d95a0 Remove http_client.h include from etcd_state_client.h 2 years ago
Vitaliy Filippov 829381b335 Extract some definitions to msgr_op.{cpp,h} 2 years ago
Vitaliy Filippov 54f2353f24 Use bitmap granularity for alignment checks 2 years ago
Vitaliy Filippov e47f6fba60 Remove cluster_client_t::stop() 2 years ago
Vitaliy Filippov 883bf84a16 Fix build 2 years ago
Vitaliy Filippov 52097c4856 Stop flushing when fewer than min_flusher_count operations are available (unless a trim is forced) 2 years ago
Vitaliy Filippov e1355cbc74 Report failed operation name in cluster_client 2 years ago
Vitaliy Filippov 8f8b90be7a Add min_flusher_count configuration 2 years ago
Vitaliy Filippov ad9f619370 Skip double allocs when reading journal 2 years ago
Vitaliy Filippov f4769ba7c7 Collapse create+delete journal entry pairs if they're already flushed
The old journal replay mechanism could lead to a double allocation of the same
block and a "Fatal error: tried to overwrite non-zero metadata entry"
2 years ago
Vitaliy Filippov 843b7052d2 Add an assertion when clearing deleted metadata entries, add debug details when freeing blocks 2 years ago
Vitaliy Filippov df99e232ee Deduplicate osd_sets in pg history + raise request size limit for etcd 2 years ago