bugfix: S3C-2006 fix crash in vault-md on listing

Listed entries may either be objects { value[, key] } or plain strings for key-only listings. Fix this by checking the filtered entry type before calling trimMetadata().
S3C-1985: heuristic to limit metadata trim to large blob
2019-03-12 17:15:08 +00:00 · 2019-02-20 17:06:20 +00:00 · 2019-02-20 17:06:10 +00:00 · 2019-02-20 17:05:58 +00:00 · 2018-11-28 12:11:40 +00:00
8 changed files with 133 additions and 30 deletions
--- a/lib/algos/list/Extension.js
+++ b/lib/algos/list/Extension.js
@ -2,6 +2,23 @@
 const { FILTER_SKIP, SKIP_NONE } = require('./tools');
 // Use a heuristic to amortize the cost of JSON
 // serialization/deserialization only on largest metadata where the
 // potential for size reduction is high, considering the bulk of the
 // blob size is due to the "location" field containing a large number
 // of MPU parts.
 //
 // Measured on some standard metadata:
 // - 100 parts -> 9K blob
 // - 2000 parts -> 170K blob
 //
 // Using a 10K threshold should lead to a worst case of about 10M to
 // store a raw listing of 1000 entries, even with some growth
 // multiplication factor due to some internal memory duplication, it
 // should stay within reasonable memory limits.
 const TRIM_METADATA_MIN_BLOB_SIZE = 10000;
 /**
 *  Base class of listing extensions.
 */
@ -23,6 +40,38 @@ class Extension {
        this.keys = 0;
    }
    /**
     * Filters-out non-requested optional fields from the value. This function
     * shall be applied on any value that is to be returned as part of the
     * result of a listing extension.
     *
     * @param {String} value - The JSON value of a listing item
     *
     * @return {String} The value that may have been trimmed of some
     * heavy unused fields, or left untouched (depending on size
     * heuristics)
     */
    trimMetadata(value) {
        let ret = undefined;
        if (value.length >= TRIM_METADATA_MIN_BLOB_SIZE) {
            try {
                ret = JSON.parse(value);
                delete ret.location;
                ret = JSON.stringify(ret);
            } catch (e) {
                // Prefer returning an unfiltered data rather than
                // stopping the service in case of parsing failure.
                // The risk of this approach is a potential
                // reproduction of MD-692, where too much memory is
                // used by repd.
                this.logger.warn(
                    'Could not parse Object Metadata while listing',
                    { err: e.toString() });
            }
        }
        return ret || value;
    }
    /**
     * Generates listing parameters that metadata can understand from the input
     * parameters. What metadata can understand: gt, gte, lt, lte, limit, keys,
--- a/lib/algos/list/basic.js
+++ b/lib/algos/list/basic.js
@ -56,7 +56,14 @@ class List extends Extension {
        if (this.keys >= this.maxKeys) {
            return FILTER_END;
        }
        if (typeof elem === 'object') {
            this.res.push({
                key: elem.key,
                value: this.trimMetadata(elem.value),
            });
        } else {
            this.res.push(elem);
        }
        this.keys++;
        return FILTER_ACCEPT;
    }
--- a/lib/algos/list/delimiter.js
+++ b/lib/algos/list/delimiter.js
@ -60,9 +60,11 @@ class Delimiter extends Extension {
     * @param {Boolean} [parameters.alphabeticalOrder] - Either the result is
     *                                                   alphabetically ordered
     *                                                   or not
     * @param {RequestLogger} logger                   - The logger of the
     *                                                   request
     */
-    constructor(parameters) {
+    constructor(parameters, logger) {
-        super(parameters);
+        super(parameters, logger);
        // original listing parameters
        this.delimiter = parameters.delimiter;
        this.prefix = parameters.prefix;
@ -142,7 +144,7 @@ class Delimiter extends Extension {
        if (this._reachedMaxKeys()) {
            return FILTER_END;
        }
-        this.Contents.push({ key, value });
+        this.Contents.push({ key, value: this.trimMetadata(value) });
        this[this.nextContinueMarker] = key;
        ++this.keys;
        return FILTER_ACCEPT;
--- a/lib/algos/list/delimiterMaster.js
+++ b/lib/algos/list/delimiterMaster.js
@ -22,9 +22,10 @@ class DelimiterMaster extends Delimiter {
     * @param {Boolean} parameters.v2         - indicates whether v2 format
     * @param {String}  parameters.startAfter - marker per amazon v2 format
     * @param {String}  parameters.continuationToken - obfuscated amazon token
     * @param {RequestLogger} logger          - The logger of the request
     */
-    constructor(parameters) {
+    constructor(parameters, logger) {
-        super(parameters);
+        super(parameters, logger);
        // non-PHD master version or a version whose master is a PHD version
        this.prvKey = undefined;
        this.prvPHDKey = undefined;
--- a/lib/algos/list/delimiterVersions.js
+++ b/lib/algos/list/delimiterVersions.js
@ -25,8 +25,8 @@ function formatVersionKey(key, versionId) {
 * @prop {Number} maxKeys              - number of keys to list
 */
 class DelimiterVersions extends Delimiter {
-    constructor(parameters) {
+    constructor(parameters, logger) {
-        super(parameters);
+        super(parameters, logger);
        // specific to version listing
        this.keyMarker = parameters.keyMarker;
        this.versionIdMarker = parameters.versionIdMarker;
@ -75,7 +75,11 @@ class DelimiterVersions extends Delimiter {
        if (this._reachedMaxKeys()) {
            return FILTER_END;
        }
-        this.Contents.push(obj);
+        this.Contents.push({
            key: obj.key,
            value: this.trimMetadata(obj.value),
            versionId: obj.versionId,
        });
        this.NextMarker = obj.key;
        this.NextVersionIdMarker = obj.versionId;
        ++this.keys;
--- a/package.json
+++ b/package.json
@ -33,7 +33,7 @@
    "socket.io-client": "~1.7.3",
    "utf8": "2.1.2",
    "uuid": "^3.0.1",
-    "werelogs": "scality/werelogs#0ff7ec82",
+    "werelogs": "scality/werelogs#hotfix/7.4.2",
    "xml2js": "~0.4.16"
  },
  "optionalDependencies": {
@ -43,7 +43,7 @@
    "eslint": "2.13.1",
    "eslint-plugin-react": "^4.3.0",
    "eslint-config-airbnb": "6.2.0",
-    "eslint-config-scality": "scality/Guidelines#71a059ad",
+    "eslint-config-scality": "scality/Guidelines#hotfix/7.4.2",
    "lolex": "1.5.2",
    "mocha": "2.5.3",
    "temp": "0.8.3"
--- a/tests/unit/algos/list/basic.js
+++ b/tests/unit/algos/list/basic.js
@ -17,9 +17,11 @@ class Test {
 describe('Basic listing algorithm', () => {
    const data = [];
    for (let i = 0; i < 15000; ++i) {
        // Following the fix for S3C-1985, data is set as a stringified JSON
        // object, so that the test does not keep logging warnings.
        data.push({
            key: `key${i}`,
-            value: `value${i}`,
+            value: `{"data":"value${i}"}`,
        });
    }
    const tests = [
@ -38,4 +40,29 @@ describe('Basic listing algorithm', () => {
            done();
        });
    });
    it('Should support entries with no key', () => {
        const res1 = performListing([{
            value: '{"data":"foo"}',
        }], Basic, { maxKeys: 1 }, logger);
        assert.deepStrictEqual(res1, [{
            key: undefined,
            value: '{"data":"foo"}',
        }]);
        const res2 = performListing([{
            key: undefined,
            value: '{"data":"foo"}',
        }], Basic, { maxKeys: 1 }, logger);
        assert.deepStrictEqual(res2, [{
            key: undefined,
            value: '{"data":"foo"}',
        }]);
    });
    it('Should support key-only listing', () => {
        const res = performListing(['key1', 'key2'],
                                   Basic, { maxKeys: 1 }, logger);
        assert.deepStrictEqual(res, ['key1']);
    });
 });
--- a/tests/unit/algos/list/delimiterMaster.js
+++ b/tests/unit/algos/list/delimiterMaster.js
@ -24,11 +24,19 @@ const EmptyResult = {
    Delimiter: undefined,
 };
 const fakeLogger = {
    trace: () => {},
    debug: () => {},
    info: () => {},
    warn: () => {},
    error: () => {},
    fatal: () => {},
 };
 describe('Delimiter All masters listing algorithm', () => {
    it('should return SKIP_NONE for DelimiterMaster when both NextMarker ' +
       'and NextContinuationToken are undefined', () => {
-        const delimiter = new DelimiterMaster({ delimiter: '/' });
+        const delimiter = new DelimiterMaster({ delimiter: '/' }, fakeLogger);
        assert.strictEqual(delimiter.NextMarker, undefined);
@ -40,7 +48,8 @@ describe('Delimiter All masters listing algorithm', () => {
    it('should return <key><VersionIdSeparator> for DelimiterMaster when ' +
       'NextMarker is set and there is a delimiter', () => {
        const key = 'key';
-        const delimiter = new DelimiterMaster({ delimiter: '/', marker: key });
+        const delimiter = new DelimiterMaster({ delimiter: '/', marker: key },
                                             fakeLogger);
        /* Filter a master version to set NextMarker. */
        // TODO: useless once S3C-1628 is fixed.
@ -57,7 +66,8 @@ describe('Delimiter All masters listing algorithm', () => {
       'NextContinuationToken is set and there is a delimiter', () => {
        const key = 'key';
        const delimiter = new DelimiterMaster(
-            { delimiter: '/', startAfter: key, v2: true });
+            { delimiter: '/', startAfter: key, v2: true },
            fakeLogger);
        // Filter a master version to set NextContinuationToken
        delimiter.filter({ key, value: '' });
@ -73,7 +83,7 @@ describe('Delimiter All masters listing algorithm', () => {
        const delimiter = new DelimiterMaster({
            delimiter: delimiterChar,
            marker: keyWithEndingDelimiter,
-        });
+        }, fakeLogger);
        /* When a delimiter is set and the NextMarker ends with the
         * delimiter it should return the next marker value. */
@ -82,7 +92,7 @@ describe('Delimiter All masters listing algorithm', () => {
    });
    it('should skip entries not starting with prefix', () => {
-        const delimiter = new DelimiterMaster({ prefix: 'prefix' });
+        const delimiter = new DelimiterMaster({ prefix: 'prefix' }, fakeLogger);
        assert.strictEqual(delimiter.filter({ key: 'wrong' }), FILTER_SKIP);
        assert.strictEqual(delimiter.NextMarker, undefined);
@ -91,7 +101,7 @@ describe('Delimiter All masters listing algorithm', () => {
    });
    it('should skip entries superior to next marker', () => {
-        const delimiter = new DelimiterMaster({ marker: 'b' });
+        const delimiter = new DelimiterMaster({ marker: 'b' }, fakeLogger);
        assert.strictEqual(delimiter.filter({ key: 'a' }), FILTER_SKIP);
        assert.strictEqual(delimiter.NextMarker, 'b');
@ -100,7 +110,7 @@ describe('Delimiter All masters listing algorithm', () => {
    });
    it('should accept a master version', () => {
-        const delimiter = new DelimiterMaster({});
+        const delimiter = new DelimiterMaster({}, fakeLogger);
        const key = 'key';
        const value = '';
@ -117,7 +127,7 @@ describe('Delimiter All masters listing algorithm', () => {
    });
    it('should accept a PHD version as first input', () => {
-        const delimiter = new DelimiterMaster({});
+        const delimiter = new DelimiterMaster({}, fakeLogger);
        const keyPHD = 'keyPHD';
        const objPHD = {
            key: keyPHD,
@ -134,7 +144,7 @@ describe('Delimiter All masters listing algorithm', () => {
    });
    it('should accept a PHD version', () => {
-        const delimiter = new DelimiterMaster({});
+        const delimiter = new DelimiterMaster({}, fakeLogger);
        const key = 'keyA';
        const value = '';
        const keyPHD = 'keyBPHD';
@ -163,7 +173,7 @@ describe('Delimiter All masters listing algorithm', () => {
    });
    it('should accept a version after a PHD', () => {
-        const delimiter = new DelimiterMaster({});
+        const delimiter = new DelimiterMaster({}, fakeLogger);
        const masterKey = 'key';
        const keyVersion = `${masterKey}${VID_SEP}version`;
        const value = '';
@ -193,7 +203,7 @@ describe('Delimiter All masters listing algorithm', () => {
    });
    it('should accept a delete marker', () => {
-        const delimiter = new DelimiterMaster({});
+        const delimiter = new DelimiterMaster({}, fakeLogger);
        const version = new Version({ isDeleteMarker: true });
        const key = 'key';
        const obj = {
@ -210,7 +220,7 @@ describe('Delimiter All masters listing algorithm', () => {
    });
    it('should skip version after a delete marker', () => {
-        const delimiter = new DelimiterMaster({});
+        const delimiter = new DelimiterMaster({}, fakeLogger);
        const version = new Version({ isDeleteMarker: true });
        const key = 'key';
        const versionKey = `${key}${VID_SEP}version`;
@ -226,7 +236,7 @@ describe('Delimiter All masters listing algorithm', () => {
    });
    it('should accept a new key after a delete marker', () => {
-        const delimiter = new DelimiterMaster({});
+        const delimiter = new DelimiterMaster({}, fakeLogger);
        const version = new Version({ isDeleteMarker: true });
        const key1 = 'key1';
        const key2 = 'key2';
@ -249,7 +259,7 @@ describe('Delimiter All masters listing algorithm', () => {
    });
    it('should accept the master version and skip the other ones', () => {
-        const delimiter = new DelimiterMaster({});
+        const delimiter = new DelimiterMaster({}, fakeLogger);
        const masterKey = 'key';
        const masterValue = 'value';
        const versionKey = `${masterKey}${VID_SEP}version`;
@ -277,7 +287,7 @@ describe('Delimiter All masters listing algorithm', () => {
    });
    it('should return good listing result for version', () => {
-        const delimiter = new DelimiterMaster({});
+        const delimiter = new DelimiterMaster({}, fakeLogger);
        const masterKey = 'key';
        const versionKey1 = `${masterKey}${VID_SEP}version1`;
        const versionKey2 = `${masterKey}${VID_SEP}version2`;
@ -320,7 +330,8 @@ describe('Delimiter All masters listing algorithm', () => {
           const prefix2Key1 = `${commonPrefix2}key1`;
           const value = 'value';
-           const delimiter = new DelimiterMaster({ delimiter: delimiterChar });
+           const delimiter = new DelimiterMaster({ delimiter: delimiterChar },
                                                 fakeLogger);
           /* Filter the first entry with a common prefix. It should be
            * accepted and added to the result. */
@ -373,7 +384,8 @@ describe('Delimiter All masters listing algorithm', () => {
        const prefix2VersionKey1 = `${commonPrefix2}key1${VID_SEP}version`;
        const value = 'value';
-        const delimiter = new DelimiterMaster({ delimiter: delimiterChar });
+        const delimiter = new DelimiterMaster({ delimiter: delimiterChar },
                                              fakeLogger);
        /* Filter the two first entries with the same common prefix to add
         * it to the result and reach the state where an entry is skipped
@ -404,7 +416,8 @@ describe('Delimiter All masters listing algorithm', () => {
        const key = `${commonPrefix}key${VID_SEP}version`;
        const value = 'value';
-        const delimiter = new DelimiterMaster({ delimiter: delimiterChar });
+        const delimiter = new DelimiterMaster({ delimiter: delimiterChar },
                                              fakeLogger);
        /* TODO: should be set to a whole key instead of just a common prefix
         * once ZENKO-1048 is fixed. */
        delimiter.NextMarker = commonPrefix;
Author	SHA1	Message	Date
Jonathan Gramain	6f3eefd489	bugfix: S3C-2006 fix crash in vault-md on listing Listed entries may either be objects { value[, key] } or plain strings for key-only listings. Fix this by checking the filtered entry type before calling trimMetadata().	2019-03-12 17:15:08 +00:00
Jonathan Gramain	460b499d9e	S3C-1985: heuristic to limit metadata trim to large blob Add a heuristic to only do the trim for large metadata blob of more than 10KB. This should limit the number of JSON (de)serializations to only those blobs where we can hope a significant reduction in size. Also renamed "filterContent" to "trimMetadata"	2019-02-20 17:06:20 +00:00
David Pineau	c70f079558	S3C-1985: Fix tests after memory consumption fix In the associated issues's memory consumption fix, a warning is logged in case of unparseable entry from the listing. This broke the unit tests for some specific listing algorithms, as they had never taken any logger as constructor parameter, as opposed to their base class. Taking a look at the usage of these classes in the known client code (CloudServer & Metadata), the logger was actually always provided, which means that we could make use of it, and accept this forgotten parameter. Fixes tests for S3C-1985, Fixes silent prototype usage mismatch.	2019-02-20 17:06:10 +00:00
David Pineau	c8a9897315	S3C-1985: Reduce memory usage of listing algorithms The issue discovered in the field was that an key with a heavy amount of data could lead to an exhaustion of the available memory for Node.JS javascript code. In the case of the buckets handling, that would be MPU objects with an important number of parts, listed in the "location" field. A customer case revealed and triggered this corner case, loop-crashing the processes responsible for the listing of the database, which were using the Arsenal.algos.list.* extensions. That issue is hereby fixed by this commit through the trimming of the known heavy unused data: the "location" field of the bucket's values. Note that as the code previously didn't need to parse the value before forwarding it back to the caller, we are now parsing it, removing the unwanted fields, and re-stringifying it for each selected entry in the listing. A notable impact of this change is that the CPU usage might go up substantially while listing the contents of any bucket. Additionally, and for safety purposes, if the data cannot be parsed and alterated, it will be returned as-is, at the risk of maintaining the memory consuming behavior in the case of that corner case; while a warning is logged on stdout. Fixes S3C-1985	2019-02-20 17:05:58 +00:00
Stephane Cance	7c16b3cfc0	hotfix/7.4.2 bump	2018-11-28 12:11:40 +00:00