
An unusually large amount of memory usage #19

Closed · pflannery opened this issue Mar 18, 2015 · 27 comments

@pflannery (Contributor)

When creating 10000 small tasks, I find that the process memory peaks at around 170MB. When I run the same 10000 tasks with setImmediate, it doesn't go over 40MB.
So it seems that TaskGroup is generating an unusually large amount of memory overhead.

Here's the code I was testing with.

var TaskGroup = require("taskgroup").TaskGroup;
var testArray = Array(10000).join().split(',');
var grp = TaskGroup.create();

testArray.forEach(function(value, index) {
  grp.addTask(function() {
    console.log("task"+index);
  });
  /*
  setImmediate(function() {
    console.log("task"+index);
  });
  */
});

grp.run();

I only tested on Windows 8.1, so I'm not sure what a Linux box will output.
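
For reference, here is a minimal sketch (not part of the original report) of how the peak could be measured in-process by sampling process.memoryUsage() while the group runs; the 10ms interval and the use of heapUsed are arbitrary choices.

var TaskGroup = require("taskgroup").TaskGroup;

// Sketch: sample process.memoryUsage() while the tasks run and report the peak.
var peakHeap = 0;
var sampler = setInterval(function () {
  var used = process.memoryUsage().heapUsed;
  if (used > peakHeap) peakHeap = used;
}, 10);

var grp = TaskGroup.create();
Array(10000).join().split(',').forEach(function (value, index) {
  grp.addTask(function () {
    // task body left empty so only TaskGroup's own overhead is measured
  });
});

grp.on("completed", function () {
  clearInterval(sampler);
  console.log("peak heapUsed: " + (peakHeap / 1024 / 1024).toFixed(1) + " MB");
});

grp.run();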

pflannery changed the title from "Unusually large amount of memory usage" to "An unusually large amount of memory usage" on Mar 20, 2015
@balupton (Member)

Perhaps linked to #12

@balupton (Member)

To be honest, I would be open to migrating TaskGroup to a pre-processor that actually cares about performance. CoffeeScript no longer cares about it at all, which is really sad. Perhaps CoffeeScript Redux is an option; otherwise we could look at TypeScript, or maybe just native JavaScript (not that it's actually a good language, though it's better than it was 5 years ago).

ref: https://developers.google.com/v8/experiments

@pflannery I've started a discussion about this here: https://discuss.bevry.me/t/move-from-coffeescript-to-es6/30/1

@pflannery (Contributor, Author)

Seeing as the topic has gone and the SSL certificate is reported as invalid, I'll just reply here for now :)

I agree. I think moving to native ES5/ES6 will give us more control over what code we use. I'm all for it.
I find CoffeeScript classes are a real pain when analysing them in a heap snapshot because they all show up as ctor instead of the class names I define, which makes it very difficult.

I've been using v8-profiler to take snapshots of the test code I posted above, and below you can see that a Task instance is around 8KB and the _events property generated by the EventEmitter is adding 5-6KB per Task. So I'm wondering if it's worth replacing the event emitter with something that is much lighter in weight?

[screenshot: heap snapshot showing Task instances and their _events retained sizes]
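
To illustrate the "much lighter in weight" idea, here is a hypothetical bare-bones emitter. It is not TaskGroup's code, and whether it would actually shrink the retained size depends on what the listeners themselves close over (see the correction in the following comment).

// Hypothetical minimal emitter: only the on/emit subset, with the listener
// table created lazily on first use.
function TinyEmitter() {
  this._listeners = null;
}
TinyEmitter.prototype.on = function (name, fn) {
  if (!this._listeners) this._listeners = {};
  (this._listeners[name] || (this._listeners[name] = [])).push(fn);
  return this;
};
TinyEmitter.prototype.emit = function (name) {
  var fns = this._listeners && this._listeners[name];
  if (!fns) return false;
  var args = Array.prototype.slice.call(arguments, 1);
  for (var i = 0; i < fns.length; i++) {
    fns[i].apply(this, args);
  }
  return true;
};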

@pflannery (Contributor, Author)

Just want to correct myself slightly: EventEmitter isn't adding 5-6KB per task; it just seems to be referencing 5-6KB.

@balupton (Member)

@pflannery can you do a screencast sometime of how you come up with that profile information? I've spent ages trying to figure it out with no luck!

@pflannery (Contributor, Author)

Here's the code for generating the heap snapshot:

var fs = require('fs');
var profiler = require('v8-profiler');

function saveSnapshot(fileName) {
  var buffer = '';
  var snapshot = profiler.takeSnapshot("test");
  snapshot.serialize(
    function iterator(data, length) {
      buffer += data; // accumulate the serialized chunks
    },
    function complete() {
      fs.writeFileSync(fileName, buffer);
    }
  );
}

var TaskGroup = require("./taskgroup").TaskGroup;
var testArray = Array(10000).join().split(',');
var grp = TaskGroup.create();

testArray.forEach(function(value, index) {
  grp.addTask(function() {
    // console.log("task"+index);
  }); 
});

grp.on("completed", function(){
  saveSnapshot("completed.heapsnapshot")
});

grp.run();

@pflannery (Contributor, Author)

Also, here's a Memory Management Masterclass video: https://www.youtube.com/watch?v=LaxbdIyBkL0

@pflannery (Contributor, Author)

I've created a gist with JavaScript that generates the CPU profile and ensures the deopt reasons will render in the latest Chrome: https://gist.github.com/pflannery/8e38a06a844a7fc362dc

Setup instructions:
  • create a temp folder
  • npm install taskgroup
  • npm install v8-profiler
  • run node taskgroup-cpuprofiling.js
  • open the generated complete.cpuprofile file in Chrome DevTools -> Profiler

@balupton (Member)

Sweet. I'll give it a go (watching the talk right now). I'm wondering if it is best to just do the performance tracking inside the browser with a browserified/webpack'd build of taskgroup. Your thoughts?

@pflannery (Contributor, Author)

I've never used Browserify or webpack, so I don't know what the outcome would be.

@pflannery (Contributor, Author)

@balupton It wouldn't let me post more than three times on the discuss thread, and it won't let me add any more links to the post either, so I've replied here.

I've been analysing the heap generated from running my test, and I've seen that commenting out the following lines takes the heap snapshot from 114MB down to 31MB. (The snapshot was taken once, during the taskgroup completed event.)

Lines here and Lines here

So from that I think this confirms that the EventEmitter (taking up 53MB) is a big part of the problem, and it also seems that the Map copying (taking up 14MB) is adding to it too.

@balupton (Member)

Interesting. A few notes:

Commenting out those events means that the 31MB taskgroup doesn't actually work...

Perhaps the size reduction from the commented-out map options is actually because it disables domains.

My other thought is that the completed event will be too early, as the normal garbage collection won't have kicked in yet; it would have to be once the taskgroup has completed and the scope of the taskgroup instance has been forgotten. I'll do some evaluating of this today. Keep up the good work.
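
One way to separate "not yet collected" from "actually retained" (a sketch only, assuming Node is started with --expose-gc, and reusing grp and saveSnapshot from the earlier comment; the delay is arbitrary):

// Sketch: force a full GC before snapshotting so the snapshot reflects what is
// actually retained rather than what simply hasn't been collected yet.
// Requires running with `node --expose-gc`.
grp.on("completed", function () {
  setTimeout(function () {
    if (global.gc) global.gc();
    saveSnapshot("completed-after-gc.heapsnapshot");
  }, 100); // small arbitrary delay so pending callbacks can settle first
});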

@balupton (Member)

@pflannery I think you came across these rate limits: https://www.dropbox.com/s/sltz6p9gavq64vz/Screenshot%202015-03-24%2008.25.01.png?dl=0

Tomorrow it won't be a problem :-)

@balupton (Member)

@pflannery I've pushed some profiling things up to the es6 branch; you can now run:

cake compile
npm run-script profile

To generate the V8 profile report.

However, you can also run:

cake compile
npm run-script browserify

Then serve the web build to your browser and profile from there! Got Chrome and Firefox both profiling 👍

@balupton (Member)

Chrome's TaskGroup (CoffeeScript version) profile (13 seconds):

https://www.dropbox.com/s/lnh01tehmtsv2cb/taskgroup-coffeescript.cpuprofile?dl=0

Conclusions from evaluating the running of tasks:

  1. A lot of time is spent emitting events; as such we should avoid emitting unnecessary events (this is what you found by commenting out the event names)
  2. A lot of time is spent waiting between tasks due to the setImmediate/nextTick calls to prevent locking; as such we should look into alternative ways of queuing tasks without causing lockups (see the sketch after this list)
  3. Creating a domain in the browser (so an EventEmitter that wraps a try/catch) takes 1ms self time, that's a lot of time
  4. Task.prototype.exit takes 1ms self time, that's a lot of time
  5. Cycling through the nested events to listen for them takes 1ms of self time, that is a lot of time
  6. Instantiating an EventEmitter class takes 2ms
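
On point 2, one hypothetical direction (a sketch of the idea, not TaskGroup's implementation; BATCH_SIZE is arbitrary) is to yield to the event loop only every N tasks instead of after every task:

// Sketch: run task functions in batches, calling setImmediate only between
// batches. The stack stays bounded without paying one deferral per task.
var BATCH_SIZE = 100; // arbitrary

function runQueue(tasks, done) {
  var index = 0;
  (function runBatch() {
    var end = Math.min(index + BATCH_SIZE, tasks.length);
    while (index < end) {
      tasks[index](); // assumes synchronous task functions for simplicity
      index += 1;
    }
    if (index < tasks.length) {
      setImmediate(runBatch); // yield, then continue with the next batch
    } else {
      done();
    }
  })();
}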

@balupton (Member)

Chrome's TaskGroup (ES6 babel version) profile (8 seconds):

https://www.dropbox.com/s/alsdwj0u97sm09c/taskgroup-es6-babel.cpuprofile?dl=0

@balupton (Member)

Okay, I was able to get it down to 3.5 seconds by removing maps and disabling nested events by default.

Removing the next-tick stuff actually makes it go way slower, as the stack continues to grow.

@pflannery any idea how to evaluate the heap snapshots to tell if it has cleaned up properly? I have no clue how I'm meant to read those snapshots.

@pflannery (Contributor, Author)

Firstly, what I do is replace Babel's _applyConstructor code (at the very top of the TaskGroup output) so we can see the Task class in the heap list; otherwise the instances show up as Object.

// Replacement helper so instances keep their constructor's name (e.g. Task)
// in heap snapshots instead of appearing as Object.
var _applyConstructor = function (Constructor, args) {
  return new (Function.prototype.bind.apply(Constructor, [null].concat(args)));
};

The way I evaluate the heaps is to take three snapshots: one at the start, one just after run, and one when it's completed - like this example.
Then I use the comparison view between them, which shows what's been added or removed.

Also, if you take a snapshot when destroy is emitted, that will show you what's left in memory after cleanup.

The three-snapshot technique will go away soon, as there is now a Record Heap Allocations feature in Chrome which gives us a timeline of heap allocations, but it's not yet working in the current v8-profiler so we can't do it yet :'(
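
A sketch of that three-snapshot flow, reusing the saveSnapshot helper and grp from the earlier comment (the file names are arbitrary):

// Sketch of the three-snapshot comparison flow. Load the files into Chrome
// DevTools and switch to the "Comparison" view to see what was allocated or
// released between snapshots.
saveSnapshot("1-start.heapsnapshot");        // baseline, before running

grp.on("completed", function () {
  saveSnapshot("3-completed.heapsnapshot");  // after all tasks have completed
});

grp.run();
saveSnapshot("2-after-run.heapsnapshot");    // immediately after run() returns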

@balupton (Member)

> The three-snapshot technique will go away soon, as there is now a Record Heap Allocations feature in Chrome which gives us a timeline of heap allocations, but it's not yet working in the current v8-profiler so we can't do it yet :'(

Working on Chrome Canary with #19 (comment)

Will try your suggestion, hopefully it will make the snapshots make sense.

@pflannery (Contributor, Author)

Just seen that using async writeFile here was keeping all the snapshots and strings in memory, so each snapshot ended up containing the previous snapshots. I've created PR #20 to resolve this.

@balupton (Member)

> Just seen that using async writeFile here was keeping all the snapshots and strings in memory, so each snapshot ended up containing the previous snapshots. I've created PR #20 to resolve this.

Merged, but still getting segfaults on heaps (neither sync nor async works). I don't get segfaults on profile writing, sync or async (both work).

@pflannery (Contributor, Author)

> Merged, but still getting segfaults on heaps (neither sync nor async works). I don't get segfaults on profile writing, sync or async (both work).

For me it does grow large but doesn't segfault with either async or sync.

The latest changes seem to have cleared this problem. I wonder how much of this was removing domains and how much was removing CoffeeScript?

@balupton (Member)

@pflannery domains are still enabled by default in node land (just disabled in browser land)
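
For context, the per-task domain pattern being discussed (each domain is itself an EventEmitter wrapping error handling) looks roughly like this sketch; it is illustrative only, not TaskGroup's exact code:

var domainModule = require('domain');

// Sketch: run one task inside its own domain so errors thrown from the task's
// asynchronous callbacks are routed to the domain's 'error' handler instead of
// crashing the process. Creating one domain per task is the per-task cost
// mentioned in the profiling notes above.
function runTaskIsolated(taskFn, done) {
  var d = domainModule.create();
  d.on('error', function (err) {
    done(err); // report the failure instead of letting it become uncaught
  });
  d.run(function () {
    taskFn(function (err, result) {
      done(err, result);
    });
  });
}

// Usage: the task signals completion via its callback.
runTaskIsolated(function (complete) {
  setImmediate(function () {
    complete(null, 'ok');
  });
}, function (err, result) {
  console.log(err || result);
});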

@balupton (Member)

@pflannery do the latest commits make this any better? Notably: 9edcd0f#commitcomment-10406680

@pflannery (Contributor, Author)

@balupton see my reply to your commit comments.

@balupton closed this as completed on Jun 4, 2016.