Commit Graph

127 Commits (e20b48790455d4db085c38d7e3a142228e6fbd24)

Author SHA1 Message Date
Xiang Li f59da0e453 *:fix point-in-time backup
Backup process should be able to read all WALs until io.EOF to
generate a point-in-time backup.

Our WAL file is append-only. And the backup process will lock all
files before start reading, which can prevent the gc routine from
removing any files in the middle.
2015-06-15 11:12:28 -07:00
Xiang Li 2373fd8426 wal: fix the left logging using default log 2015-06-11 14:22:14 -07:00
Xiang Li 711451ce2d *: rename logger to plog 2015-06-02 14:58:24 -07:00
Xiang Li 185d2bced4 wal: use leveled logger 2015-06-01 13:38:50 -07:00
Xiang Li 34ac145b38 *: use namespace and subsystem in metrics
Fix #2841.

From Prometheus developer:
```
the recommended way for etcd as an open source project and under
consideration of its size would be etcd_<subsystem>_<name>.
```

We made the naming change accordingly.
2015-05-26 14:39:04 -07:00
Xiang Li d1d7feacc9 wal: change io.EOF returned by readFull to io.ErrUnexpectedEOF
Decoder should return error for any broken block including the
one that only contains the length field. We should change io.EOF
to io.ErrUnexpectedEOF before return the error.
2015-04-23 09:53:36 -07:00
Xiang Li aed18395c9 wal: report throughput in wal bench 2015-04-12 21:35:08 -07:00
Xiang Li 89242d4659 wal: better log msg 2015-04-09 09:54:20 -07:00
Xiang Li 53792ccbdc wal: never leave a corrupted wal file
If the process dies during wal.cut(), it might leave a corrupted wal
file. This commit solves the problem by creating a temp wal file first,
then atomically rename it to a wal file when we are sure it is vaild.
2015-04-08 15:57:20 -07:00
Yicheng Qin 44de670de7 wal: allow at most one WAL function called at one time
SaveSnap and Save are called in separate goroutines now. Allow at most
one WAL function being called at one time to protect internal fields and
guarantee execution order.
Or one possible bug is that the new cut file is started with snapshot
entry instead of crc entry.
2015-04-08 00:34:30 -07:00
Xiang Li 8bcaa2bfdf wal: reduce allocation when encoding int64 2015-03-30 20:41:31 -07:00
Xiang Li c32cca3a4f wal: reduce allocation when encoding entries 2015-03-30 19:20:46 -07:00
Xiang Li 3e9a033cd2 wal: repair decoder needs to update its crc 2015-03-30 13:45:23 -07:00
Xiang Li 684ebd95ae wal: backup broken wal before repairing 2015-03-29 15:42:59 -07:00
Xiang Li 8b4eed29e5 wal: fix the unexpectedEOF error in the last wal.
It is safe to repair the unexpectedEOF error in the last wal. raft
will not send out message before the entry successfully comitted
into wal. Thus we can safely truncate the last entry in the wal
to repair.
2015-03-28 21:08:14 -07:00
Xiang Li 05e240b892 *: update protobuf 2015-03-25 10:14:35 -07:00
Yicheng Qin 3dd6e0b88f wal: fix missing import 2015-03-24 22:53:15 -07:00
Xiang Li 6e6669d696 wal: releastTo should work with large release index 2015-03-24 22:34:26 -07:00
Yicheng Qin 5e0077cc0c etcdserver: print out extra files in data dir instead of erroring 2015-03-24 18:56:22 -07:00
Xiang Li b66eb3d81c wal: fix ReleaseLockTo
ReleaseLockTo should not release the lock on the WAL
segment that is right before the given index. When
restarting etcd, etcd needs to read from the WAL segment
that has a smaller index than the snapshot index.

The correct behavior is that ReleaseLockTo releases
the locks w is holding so that w only holds one lock
that has an index smaller than the given index.
2015-03-09 19:52:54 -07:00
Xiang Li ab72c3ec88 wal: do not race reader and writer 2015-03-05 20:19:17 -08:00
Xiang Li 86429264fb wal: support auto-cut in wal
WAL should control the cut logic itself. We want to do falloc to
per allocate the space for a segmented wal file at the beginning
and cut it when it size reaches the limit.
2015-02-28 11:18:59 -08:00
Xiang Li 84485643fe *: expose wal metrics at /metrics 2015-02-28 11:06:11 -08:00
Xiang Li fb1a28c65d *: vendor prometheus 2015-02-28 11:06:11 -08:00
Xiang Li 9e63b1fb63 wal: record metrics 2015-02-28 10:12:35 -08:00
Yicheng Qin 3fd9136740 migrate/starter: fix v2 data dir checking 2015-02-24 11:47:56 -08:00
Barak Michener 92dca0af0f *: remove shadowing of variables from etcd and add travis test
We've been bitten by this enough times that I wrote a tool so that
it never happens again.
2015-02-17 16:31:42 -05:00
Barak Michener fade9b6065 etcdserver: Refactor 2.0.1 directory rename into a proper migration
fix all instances

fix detection test
2015-02-12 11:53:19 -05:00
Yicheng Qin e966e565c4 etcdctl/backup_command: handle datadir with missed snapshot mark
This helps to recover from the data dir created in v2.0.0-rc1.
2015-01-29 13:32:59 -08:00
Jonathan Boulle f1ed69e883 *: switch to line comments for copyright
Build tags are not compatible with block comments.
Also adds copyright header to a few places it was missing.
2015-01-26 09:53:30 -08:00
Yicheng Qin 05e591f805 wal: remove unused encoder.buffered func 2015-01-09 14:59:46 -08:00
Yicheng Qin 9bdc343b7c wal: add ReleaseLockTo test 2015-01-09 14:59:41 -08:00
Yicheng Qin 270e67db84 wal: not export unnecessary public functions 2015-01-09 14:55:10 -08:00
Yicheng Qin 50c179ec1c wal: add DetectVersion test 2015-01-09 14:55:05 -08:00
Yicheng Qin f08d1090d0 wal: refine parseWalName function
According to http://godoc.org/fmt#Scan, if scan number is less than the
number of arguments, err will report why. So we don't need to handle
this error case.
2015-01-08 14:56:21 -08:00
Yicheng Qin 9532810f76 wal: remove unused max function 2015-01-08 14:49:14 -08:00
Yicheng Qin 6460e49a33 wal: save empty snapshot when create
So caller can open at empty snapshot to read all entries.
2015-01-06 19:48:21 -08:00
Yicheng Qin 78bb207bac wal: update doc about snapshot 2015-01-06 19:33:57 -08:00
Yicheng Qin 84f62f21ee wal: record and check snapshot 2015-01-06 16:27:40 -08:00
Jonathan Boulle 1ec98cb795 pkg/fileutil: sort filenames during ReadDir 2014-12-18 16:36:11 -08:00
Xiang Li 502396edd5 wal: fix wal doc 2014-12-14 19:36:37 -08:00
Xiang Li 53bf7e4b5e wal: rename openAtIndex -> open; OpenAtIndexUntilUsing -> openNotInUse 2014-12-14 19:33:06 -08:00
Xiang Li f538cba272 *: do not backup files still in use 2014-12-14 19:27:22 -08:00
Xiang Li ea94d19147 *: lock the in using files; do not purge locked the wal files 2014-12-14 19:27:22 -08:00
Barak Michener cf7690cb51 detect more cases of empty directories and actual errors 2014-12-11 13:37:32 -05:00
Barak Michener 421fe128c3 Return Unknown instead of NotExist
Unless the data dir truly does not exist.
2014-12-11 13:09:50 -05:00
Yicheng Qin b9bf957c6d wal: sync after writing data to disk in Cut function 2014-12-04 22:56:34 -08:00
Yicheng Qin af0f34c595 wal: save latest state into new WAL
So we could always read out state when open at valid index.
2014-12-04 12:19:21 -08:00
Yicheng Qin aa61009560 wal: not return ErrIndexNotFound in ReadAll
This IndexNotFound case is reasonable now because we don't write dummy
entries into wals any more.
2014-12-02 00:28:54 -08:00
Xiang Li d3db010190 *: support purging old wal/snap files 2014-12-01 11:50:17 -08:00