1288993 - Run valgrind-mochitest twice a day as a Tier 2 job

Comment 3

8 years ago

Possibly related to https://github.com/mozilla/ouija/issues/186

Assignee

Comment 5

8 years ago

Now that cron.yml is hooked up, let's work on this.

Assignee

Comment 6

8 years ago

Attached patch run mochitest-valgrind on m-c on the nightly builds (obsolete) — Details — Splinter Review

I am happy to take this bug. I know we want this twice/day, and I am not sure that depending on the nightlies is a good idea, but this should get us started. Please let me know how to test this or what else I should do.

Assignee: jseward → jmaher

Status: NEW → ASSIGNED

Attachment #8832958 - Flags: feedback?(dustin)

Comment 7

8 years ago

Comment on attachment 8832958 [details] [diff] [review]
run mochitest-valgrind on m-c on the nightly builds

Review of attachment 8832958 [details] [diff] [review]:
-----------------------------------------------------------------

::: .cron.yml
@@ +29,5 @@
>
> +    - name: nightly-mochitest-valgrind
> +      job:
> +          type: decision-task
> +          treeherder-symbol: tc-M-V()

That looks like a group name with no symbol - does that do something special in TreeHerder? I suspect something like Vg would be better; it will appear on the decision task row in treeherder.

@@ +30,5 @@
> +    - name: nightly-mochitest-valgrind
> +      job:
> +          type: decision-task
> +          treeherder-symbol: tc-M-V()
> +          triggered-by: nightly

This needs some work still, but I don't think you want --triggered-by=nightly here -- you just want a "regular" decision task, only with a target tasks method

@@ +35,5 @@
> +          target-tasks-method: mochitest_valgrind
> +      projects:
> +          - mozilla-central
> +      when:
> +          - {hour: 16, minute: 0}

It would probably be good to run this at a different time from the nightlies, just to get a more even task load.

::: taskcluster/taskgraph/target_tasks.py
@@ +138,5 @@
>      return [l for l in filtered_for_project if filter(full_task_graph[l])]
>
>
> +@_target_task('mochitest_valgrind')
> +def target_tasks_valgrind(full_task_graph, parameters):

This is great -- exactly how target task methods were intended :)

Attachment #8832958 - Flags: feedback?(dustin) → feedback+

Assignee

Comment 8

8 years ago

Attached patch run mochitest-valgrind twice/day — Details — Splinter Review

Thanks for the feedback. I have adjusted this, and I believe what I have is more in line with a final solution. Please r- if there are nits or if I am doing something wrong.

Attachment #8832958 - Attachment is obsolete: true

Attachment #8833291 - Flags: review?(dustin)

Wes Kocher (:KWierso) (Not reading bugmail; email directly if needed)

Comment 9

8 years ago

Comment on attachment 8833291 [details] [diff] [review]
run mochitest-valgrind twice/day

Review of attachment 8833291 [details] [diff] [review]:
-----------------------------------------------------------------

::: .cron.yml
@@ +29,5 @@
> +      job:
> +          type: decision-task
> +          treeherder-symbol: Vg
> +          target-tasks-method: mochitest_valgrind
> +      projects:

This is `run-on-projects` now (but the format is the same)

Attachment #8833291 - Flags: review?(dustin) → review+

Pulsebot

Comment 10

8 years ago

Pushed by jmaher@mozilla.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/b8370948ee4a Run valgrind-mochitest twice a day as a Tier 2 job. r=dustin

Comment 11

8 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/b8370948ee4a

Status: ASSIGNED → RESOLVED

Closed: 8 years ago

status-firefox54: --- → fixed

Resolution: --- → FIXED

Phil Ringnalda (:philor)

Comment 12

8 years ago

https://treeherder.mozilla.org/logviewer.html#?job_id=75018174&repo=mozilla-central

[task 2017-02-07T04:01:56.233593Z] Traceback (most recent call last):
[task 2017-02-07T04:01:56.233644Z]   File "/home/worker/checkouts/gecko/taskcluster/mach_commands.py", line 165, in taskgraph_decision
[task 2017-02-07T04:01:56.233690Z]     return taskgraph.decision.taskgraph_decision(options)
[task 2017-02-07T04:01:56.233745Z]   File "/home/worker/checkouts/gecko/taskcluster/taskgraph/decision.py", line 106, in taskgraph_decision
[task 2017-02-07T04:01:56.233794Z]     create_tasks(tgg.optimized_task_graph, tgg.label_to_taskid, parameters)
[task 2017-02-07T04:01:56.233847Z]   File "/home/worker/checkouts/gecko/taskcluster/taskgraph/create.py", line 76, in create_tasks
[task 2017-02-07T04:01:56.233874Z]     f.result()
[task 2017-02-07T04:01:56.233947Z]   File "/home/worker/checkouts/gecko/python/futures/concurrent/futures/_base.py", line 398, in result
[task 2017-02-07T04:01:56.233983Z]     return self.__get_result()
[task 2017-02-07T04:01:56.234036Z]   File "/home/worker/checkouts/gecko/python/futures/concurrent/futures/thread.py", line 55, in run
[task 2017-02-07T04:01:56.234086Z]     result = self.fn(*self.args, **self.kwargs)
[task 2017-02-07T04:01:56.234146Z]   File "/home/worker/checkouts/gecko/taskcluster/taskgraph/create.py", line 108, in create_task
[task 2017-02-07T04:01:56.234180Z]     res.raise_for_status()
[task 2017-02-07T04:01:56.234234Z]   File "/home/worker/checkouts/gecko/python/requests/requests/models.py", line 840, in raise_for_status
[task 2017-02-07T04:01:56.234274Z]     raise HTTPError(http_error_msg, response=self)
[task 2017-02-07T04:01:56.234330Z] HTTPError: 409 Client Error: Conflict for url: http://taskcluster/queue/v1/task/XYQVC7MnQA2wZSjl949hXg

Status: RESOLVED → REOPENED

Resolution: FIXED → ---

Assignee

Comment 13

8 years ago

I had thought these changes were backed out, but in dxr and my local mozilla-inbound checkout I see all the code from the patch, and on mozilla-central I can see the Vg job:
https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&filter-searchStr=Gecko%20Decision%20Task%20opt%20Decision%20task%20for%20cron%20job%20nightly-mochitest-valgrind%20cron(Vg)&selectedJob=75391724

I think next up is getting more tests running under Vg. Looking at the Vg task, I don't see the mochitest-valgrind tests running. We have this target task method:
https://dxr.mozilla.org/mozilla-central/source/taskcluster/taskgraph/target_tasks.py#150

:dustin, could you help shed light on why we run the Vg task, but not the mochitest-valgrind tests?

Status: REOPENED → ASSIGNED

Flags: needinfo?(dustin)

Comment 14

8 years ago

From https://bugzilla.mozilla.org/show_bug.cgi?id=1339148:

For example, code coverage and valgrind ran on this task:
https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&revision=4ec373fafebf79846cd5fde0561ac02fa0bb9647&filter-searchStr=cron&group_state=expanded

valgrind is defined in cron.yml:
https://dxr.mozilla.org/mozilla-central/source/.cron.yml#42
which calls:
target-tasks-method: mochitest_valgrind

and the target task method is here:
https://dxr.mozilla.org/mozilla-central/source/taskcluster/taskgraph/target_tasks.py#152

and the definition of the tests is here:
https://dxr.mozilla.org/mozilla-central/source/taskcluster/ci/test/tests.yml#808

Comment 16

8 years ago

Ugh, I mid-aired myself, having written out a long explanation. Here's what I did, in summary: looked at the decision task (Vg), and at the logs, to see that the target task method filtered out all but four tasks, so it is probably to blame. Then I found a valgrind task in full-task-graph.json, and looked at its attributes. I mentally executed the filter in the target task method against those attributes, and noted that the `unittest_suite` attribute is not what you want (it is "mochitest" in this case). I think you want to check both `unittest_suite` and `unittest_flavor`.
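
The check suggested above can be sketched roughly as follows. This is a minimal illustration, not the patch itself: `is_valgrind_mochitest` is a hypothetical helper name, the valgrind attribute values are the ones quoted elsewhere in this bug, and the "plain" flavor of the second task is an assumed value used only for contrast.

```python
# Hypothetical helper; attribute names follow the comments in this bug.
def is_valgrind_mochitest(attributes):
    """Return True only when both suite and flavor point at mochitest-valgrind."""
    suite = attributes.get("unittest_suite", "")
    flavor = attributes.get("unittest_flavor", "")
    return suite == "mochitest" and flavor.startswith("valgrind")


valgrind_task = {"unittest_suite": "mochitest", "unittest_flavor": "valgrind-plain"}
# Assumed flavor value for an ordinary mochitest task, for contrast only.
plain_task = {"unittest_suite": "mochitest", "unittest_flavor": "plain"}

# Both tasks share the same suite, so only the flavor tells them apart.
print(is_valgrind_mochitest(valgrind_task))
print(is_valgrind_mochitest(plain_task))
```

Checking the suite alone would treat both tasks the same way, which is why the flavor needs to be part of the filter.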

Flags: needinfo?(dustin)

Assignee

Comment 17

8 years ago

Attached patch use proper taskcluster attributes to define cron tasks (obsolete) — Details — Splinter Review

Thanks for the pointer, :dustin. I believe I have this correct for valgrind and code coverage, so please review and let me know what you think.

Attachment #8836841 - Flags: review?(dustin)

Comment 18

8 years ago

Comment on attachment 8836841 [details] [diff] [review]
use proper taskcluster attributes to define cron tasks

Review of attachment 8836841 [details] [diff] [review]:
-----------------------------------------------------------------

I think this *might* work, but can definitely be clearer (at least clear enough that it's not uncertain whether it would work..)

::: taskcluster/taskgraph/target_tasks.py
@@ +157,5 @@
>          # only select platforms
>          if platform not in ['linux64']:
>              return False
> +        if task.attributes.get('unittest_suite') or \
> +           task.attributes.get('unittest_flavor'):

Is this conditional meant to guard against KeyError in the accesses below? If so, it should be "and" not "or".

@@ +158,5 @@
>          if platform not in ['linux64']:
>              return False
> +        if task.attributes.get('unittest_suite') or \
> +           task.attributes.get('unittest_flavor'):
> +            if not (task.attributes['unittest_suite'].startswith('mochitest-valgrind') or

No suite names start with mochitest-valgrind. Check out the full-task-graph.json and find the valgrind test to see what attributes it has (I think the suite is `mochitest`, but double-check me)

@@ +159,5 @@
>              return False
> +        if task.attributes.get('unittest_suite') or \
> +           task.attributes.get('unittest_flavor'):
> +            if not (task.attributes['unittest_suite'].startswith('mochitest-valgrind') or
> +                    task.attributes['unittest_flavor'].startswith('mochitest-valgrind')):

This will probably end up accidentally doing what you want, since you combine these with "or", and since no non-mochitest suites have a flavor named `mochitest-valgrind`.

@@ +169,5 @@
>  @_target_task('nightly_code_coverage')
>  def target_tasks_code_coverage(full_task_graph, parameters):
>      """Target tasks that generate coverage data."""
>      def filter(task):
> +        platform = task.attributes.get('test_platform')

I can't tell what the diff is in this hunk....
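
To illustrate the first review point, here is a small sketch of why an "or" guard does not protect the indexed accesses that follow it. The function names are hypothetical and the body is a simplified version of the reviewed conditional (the `if not` wrapper is dropped), so treat it as an illustration of the pitfall rather than the patch under review.

```python
def guarded_with_or(attributes):
    # The guard passes when EITHER key is present, but the body indexes BOTH,
    # so a task that has only one of the two attributes raises KeyError.
    if attributes.get("unittest_suite") or attributes.get("unittest_flavor"):
        return (attributes["unittest_suite"].startswith("mochitest-valgrind")
                or attributes["unittest_flavor"].startswith("mochitest-valgrind"))
    return False


def guarded_with_defaults(attributes):
    # Using .get() with an empty-string default needs no separate guard.
    suite = attributes.get("unittest_suite", "")
    flavor = attributes.get("unittest_flavor", "")
    return suite.startswith("mochitest") and flavor.startswith("valgrind-plain")


only_suite = {"unittest_suite": "mochitest"}  # no unittest_flavor key

try:
    guarded_with_or(only_suite)
    raised = False
except KeyError:
    raised = True

print(raised)
print(guarded_with_defaults(only_suite))
```

The second function uses the same `.get(key, '')` pattern that appears in the target task method quoted later in this bug.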

Attachment #8836841 - Flags: review?(dustin) → review-

Assignee

Comment 19

8 years ago

Attached patch use proper taskcluster attributes to define cron tasks — Details — Splinter Review

After discussing over Vidyo, I understand more of what I am doing. I verified this with data from the task-graph.json that I generated on the try server:
https://public-artifacts.taskcluster.net/KQ_z07M7Qo-xCesUkJuk7g/0/public/task-graph.json

Attachment #8836841 - Attachment is obsolete: true

Attachment #8838622 - Flags: review?(dustin)

Wes Kocher (:KWierso) (Not reading bugmail; email directly if needed)

Comment 20

8 years ago

Comment on attachment 8838622 [details] [diff] [review]
use proper taskcluster attributes to define cron tasks

Review of attachment 8838622 [details] [diff] [review]:
-----------------------------------------------------------------

What's got two thumbs and likes this patch?

Attachment #8838622 - Flags: review?(dustin) → review+

Pulsebot

Comment 21

8 years ago

Pushed by jmaher@mozilla.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/bb77e8d293e0 adjust target tasks to use correct taskcluster attributes. r=dustin

Comment 22

8 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/bb77e8d293e0

Status: ASSIGNED → RESOLVED

Closed: 8 years ago → 8 years ago

Resolution: --- → FIXED

Assignee

Comment 23

8 years ago

this is deployed and I assume working properly (the code coverage cron task is). I do not see any test jobs related to valgrind, is it possible there are none ready?

Flags: needinfo?(jseward)

Comment 24

8 years ago

(In reply to Joel Maher ( :jmaher) from comment #23)
> this is deployed and I assume working properly

Joel, that's great to hear.

> I do not see any test jobs related to valgrind, is it possible there
> are none ready?

I am not sure what I need to provide here in order to complete the picture. Currently I have it that if you push to try with the syntax "-b o -p linux64 -u mochitest-valgrind -t none", you get a v/mochi run, which shows up in the usual way in Treeherder, and that is what I'd hoped to have auto-run.

There is an entry in .cron.yml that looks plausible:

    - name: nightly-mochitest-valgrind
      job:
          type: decision-task
          treeherder-symbol: Vg
          target-tasks-method: mochitest_valgrind
      run-on-projects:
          - mozilla-central
      when:
          - {hour: 16, minute: 0}
          - {hour: 4, minute: 0}

Is that what you were looking for, or something else? Sorry to be so vague about this.
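
As a rough illustration of how the two `when` entries in that cron job translate into a twice-daily schedule, here is a minimal sketch. The `should_run` function is hypothetical and is not the in-tree cron code, which may match times differently; the times are assumed to be UTC.

```python
from datetime import datetime

# The two schedule entries from the .cron.yml job quoted above.
WHEN = [{"hour": 16, "minute": 0}, {"hour": 4, "minute": 0}]


def should_run(when, now):
    """Hypothetical check: does `now` fall exactly on one of the `when` entries?"""
    return any(now.hour == entry["hour"] and now.minute == entry["minute"]
               for entry in when)


print(should_run(WHEN, datetime(2017, 3, 4, 16, 0)))  # afternoon slot
print(should_run(WHEN, datetime(2017, 3, 5, 4, 0)))   # early-morning slot
print(should_run(WHEN, datetime(2017, 3, 5, 12, 0)))  # not scheduled
```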

Flags: needinfo?(jseward) → needinfo?(jmaher)

Assignee

Comment 25

8 years ago

so this entry is supposed to trigger a valgrind build and/or all tests (i.e. mochitest-valgrind). This is defined here:
https://dxr.mozilla.org/mozilla-central/source/taskcluster/taskgraph/target_tasks.py#152

@_target_task('mochitest_valgrind')
def target_tasks_valgrind(full_task_graph, parameters):
    """Target tasks that only run on the cedar branch."""
    def filter(task):
        platform = task.attributes.get('test_platform')
        if platform not in ['linux64']:
            return False
        if task.attributes.get('unittest_suite', '').startswith('mochitest') and \
           task.attributes.get('unittest_flavor', '').startswith('valgrind-plain'):
            return True
        return False

    return [l for l, t in full_task_graph.tasks.iteritems() if filter(t)]

and my assumption was that this would launch:
https://dxr.mozilla.org/mozilla-central/source/taskcluster/ci/test/tests.yml#761

I guess my question is- are we expecting those tests to be run twice/day? If not, then we have more work to do.

Flags: needinfo?(jmaher)

Comment 26

8 years ago

(In reply to Joel Maher ( :jmaher) from comment #25)
> I guess my question is- are we expecting those tests to be run twice/day?
> If not, then we have more work to do.

I am lost, unfortunately. Can we talk on irc?

Comment 27

8 years ago

(In reply to Joel Maher ( :jmaher) from comment #25)
> I guess my question is- are we expecting those tests to be run twice/day?

Yes, it is those. Although they are a few lines further down the file now:
https://dxr.mozilla.org/mozilla-central/source/taskcluster/ci/test/tests.yml#773

Assignee

Comment 28

8 years ago

I am not clear why this isn't working. For example we have a mochitest-valgrind-1 definition here:
https://public-artifacts.taskcluster.net/H9uSK3PmSF2hnA55WX_ygg/0/public/task-graph.json

attributes
    build_platform      "linux64"
    build_type          "opt"
    e10s                false
    kind                "test"
    run_on_projects
    test_chunk          "1"
    test_platform       "linux64"
    unittest_flavor     "valgrind-plain"
    unittest_suite      "mochitest"
    unittest_try_name   "mochitest-valgrind"

and in our cron.yml target task that we call ( https://dxr.mozilla.org/mozilla-central/source/taskcluster/taskgraph/target_tasks.py#152 ):

@_target_task('mochitest_valgrind')
def target_tasks_valgrind(full_task_graph, parameters):
    """Target tasks that only run on the cedar branch."""
    def filter(task):
        platform = task.attributes.get('test_platform')
        if platform not in ['linux64']:
            return False
        if task.attributes.get('unittest_suite', '').startswith('mochitest') and \
           task.attributes.get('unittest_flavor', '').startswith('valgrind-plain'):
            return True
        return False

    return [l for l, t in full_task_graph.tasks.iteritems() if filter(t)]

I really do not know why these are not scheduled, for example:
https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&revision=106a96755d3bcebe64bbbc3b521d65d262ba9c02&filter-searchStr=cron%20valgrind

:dustin, can you see anything that is going wrong here?
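
One way to sanity-check the filter against the attributes quoted above is to run it on a stand-in task object. The sketch below is a Python 3 port for illustration only: `FakeTask` is a hypothetical class standing in for taskgraph's real task type, `items()` replaces the Python 2 `iteritems()`, and the second task with its "plain" flavor is invented as a negative case.

```python
class FakeTask:
    """Hypothetical stand-in exposing only the `attributes` dict."""

    def __init__(self, attributes):
        self.attributes = attributes


def valgrind_filter(task):
    # Same logic as the filter quoted in this comment.
    platform = task.attributes.get("test_platform")
    if platform not in ["linux64"]:
        return False
    if task.attributes.get("unittest_suite", "").startswith("mochitest") and \
            task.attributes.get("unittest_flavor", "").startswith("valgrind-plain"):
        return True
    return False


tasks = {
    # Attributes copied from the task-graph.json excerpt in this comment.
    "mochitest-valgrind-1": FakeTask({
        "build_platform": "linux64",
        "build_type": "opt",
        "e10s": False,
        "kind": "test",
        "test_chunk": "1",
        "test_platform": "linux64",
        "unittest_flavor": "valgrind-plain",
        "unittest_suite": "mochitest",
        "unittest_try_name": "mochitest-valgrind",
    }),
    # Invented negative case; the label and flavor value are assumptions.
    "hypothetical-plain-mochitest": FakeTask({
        "test_platform": "linux64",
        "unittest_flavor": "plain",
        "unittest_suite": "mochitest",
    }),
}

selected = [label for label, task in tasks.items() if valgrind_filter(task)]
print(selected)
```

Under these assumptions the filter selects the quoted task, which suggests the target task method itself is not what keeps the jobs from appearing.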

Flags: needinfo?(dustin)

Updated

8 years ago

Status: RESOLVED → REOPENED

Flags: needinfo?(dustin)

Resolution: FIXED → ---

Updated

8 years ago

Flags: needinfo?(dustin)

Comment 29

8 years ago

It actually did run those jobs
https://tools.taskcluster.net/task-group-inspector/#/NK8dGPYGR9WOXbJrmfesfQ?_k=ybsuyh

what's not clear is, why they didn't show up in treeherder.

https://queue.taskcluster.net/v1/task/ce1Plsk2RjCDtNqB2DW6vQ has

  "routes": [
    "tc-treeherder.v2.mozilla-central.106a96755d3bcebe64bbbc3b521d65d262ba9c02.-1",
    "tc-treeherder-stage.v2.mozilla-central.106a96755d3bcebe64bbbc3b521d65d262ba9c02.-1"
  ],

and

  "extra": {
    "treeherder": {
      "jobKind": "test",
      "groupSymbol": "tc-M-V",
      "collection": {
        "opt": true
      },
      "machine": {
        "platform": "linux64"
      },
      "groupName": "Mochitests on Valgrind executed by TaskCluster",
      "tier": 1,
      "symbol": "10"
    }
  }

the decision task (the Vg that does show up) has

  "routes": [
    "index.gecko.v2.mozilla-central.latest.firefox.decision", // not relevant to TH
    "tc-treeherder.v2.mozilla-central.106a96755d3bcebe64bbbc3b521d65d262ba9c02.-1",
    "tc-treeherder-stage.v2.mozilla-central.106a96755d3bcebe64bbbc3b521d65d262ba9c02.-1"
  ],

and

  "extra": {
    "treeherder": {
      "symbol": "Vg",
      "groupSymbol": "cron"
    }
  }

Greg, based on what you know of the TH integration, can you see why those might not have shown up? I tried turning off exclusions, etc., and no luck.
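
The comparison made above, that the test task and the decision task carry the same Treeherder routes, can be sketched as follows. The field names returned by `parse_treeherder_route` are assumptions chosen for illustration, as is the helper itself; only the route strings come from this comment.

```python
def parse_treeherder_route(route):
    """Split a route string into its dot-separated fields.

    The field names are assumptions for illustration only.
    """
    prefix, version, project, revision, pushlog_id = route.split(".")
    return {
        "prefix": prefix,
        "version": version,
        "project": project,
        "revision": revision,
        "pushlog_id": int(pushlog_id),
    }


# Routes copied from the two task definitions quoted in this comment.
test_task_route = (
    "tc-treeherder.v2.mozilla-central."
    "106a96755d3bcebe64bbbc3b521d65d262ba9c02.-1"
)
decision_task_route = (
    "tc-treeherder.v2.mozilla-central."
    "106a96755d3bcebe64bbbc3b521d65d262ba9c02.-1"
)

test_fields = parse_treeherder_route(test_task_route)
decision_fields = parse_treeherder_route(decision_task_route)

# Identical fields mean both tasks report against the same project and revision.
same_target = test_fields == decision_fields
print(test_fields["project"], test_fields["revision"], same_target)
```

Since both tasks report to the same place, the routes alone do not explain why one is visible and the other is not.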

Flags: needinfo?(dustin) → needinfo?(garndt)

Greg Arndt [:garndt]

Comment 30

8 years ago

I turned off the "excluded jobs" filter and see it: https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&revision=106a96755d3bcebe64bbbc3b521d65d262ba9c02&filter-searchStr=linux64%20tc-m-v&group_state=expanded&filter-tier=1&filter-tier=2&filter-tier=3&selectedJob=80414218&exclusion_profile=false

Flags: needinfo?(garndt)

Assignee

Comment 31

8 years ago

Odd, I tried that, and it looks like dustin tried that too. Either way, this is working. :jseward, do you have what you need here? Is there a plan for getting these green and showing again?

Flags: needinfo?(jseward)

Comment 32

8 years ago

(In reply to Joel Maher ( :jmaher) from comment #31)

Great!

> :jseward, do you have what you need here?

Nearly! One more question: how do I find these URLs? Is there a way for me to see all the valgrind runs and nothing else?

> Is there a plan for getting these green and showing again?

There are 3 sources of failure:

(1) Timeouts caused by valgrind. I looked at these a while back and can get back to them.

(2) Errors reported by valgrind. I can fix the real ones and suppress the false ones, and have slowly been doing so.

(3) Failures that would have occurred anyway (running natively). These are a bit of a problem because there's no easy way to distinguish them from (2) without having to look at all the failing chunks -- in both cases they go orange. Ideally they could be a different colour. I filed bug 1341406 about that.

Flags: needinfo?(jseward) → needinfo?(jmaher)

Assignee

Comment 33

8 years ago

Here is a link to help you:
https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&filter-searchStr=valgrind&exclusion_profile=false

I go to:
https://treeherder.mozilla.org/#/jobs?repo=mozilla-central
search for 'valgrind', then click 'excluded jobs', which allows you to see the results.

Thanks for the information about the greening up! These are getting greener :)

Please close this if you feel that there is nothing else to do here.
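
For anyone who wants to build such links programmatically, here is a minimal sketch. The helper name `treeherder_search_url` is hypothetical; the query parameters are the ones visible in the link above, and nothing here is an official Treeherder API.

```python
from urllib.parse import urlencode

TREEHERDER_JOBS = "https://treeherder.mozilla.org/#/jobs"


def treeherder_search_url(search, repo="mozilla-central"):
    """Build a jobs-view link with a search string and excluded jobs shown."""
    query = urlencode({
        "repo": repo,
        "filter-searchStr": search,
        "exclusion_profile": "false",  # show jobs hidden by the exclusion profile
    })
    return TREEHERDER_JOBS + "?" + query


url = treeherder_search_url("valgrind")
print(url)
```

Passing a different search string produces a narrower or broader view of the same repository.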

Flags: needinfo?(jmaher)

Comment 34

8 years ago

(In reply to Joel Maher ( :jmaher) from comment #33)

Joel, Dustin, thank you for doing this!

A perhaps better URL can be constructed by searching for the "tc-M-V" string:
https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&filter-searchStr=tc-M-V&exclusion_profile=false

So it seems to work. But I noticed just now an interesting anomaly, which you can see at (eg)
https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&revision=eb23648534779c110f3a1f2baae1849ae4a9c570&filter-searchStr=tc-M-V&exclusion_profile=false

Each run would normally display 40 chunk results inside the tc-M-V parentheses, but this one -- and others I've seen -- display 80. At first, I thought that each job had been run twice. But no, what seems to have happened is that treeherder is displaying together the result of two different sets of runs, one of which was requested at "Sat Mar 4, 17:04:04" and the other at "Sun Mar 5, 5:02:55".

Is that expected? I assume this is somehow related to the fact that there were no merges to m-c over the weekend (or at least in the interval between the two abovementioned dates) and so the two builds are regarded as identical.

Is it possible to fix this easily?

Flags: needinfo?(jmaher)

Comment 35

8 years ago

Treeherder indexes by revision, so if the jobs ran on the same revision, no, there is no distinction.
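
A toy model of that behaviour, for illustration only: this is not how Treeherder stores jobs, just a sketch of grouping by revision. The request times, the revision and the chunk count come from comment 34; the data layout is an assumption.

```python
from collections import defaultdict

CHUNKS_PER_RUN = 40  # chunk count per run, as reported in comment 34
REVISION = "eb23648534779c110f3a1f2baae1849ae4a9c570"

# Two cron-triggered runs that landed on the same m-c revision.
runs = [
    {"requested": "Sat Mar 4, 17:04:04", "revision": REVISION},
    {"requested": "Sun Mar 5, 5:02:55", "revision": REVISION},
]

# Grouping by revision alone cannot keep the two runs apart.
chunks_by_revision = defaultdict(int)
for run in runs:
    chunks_by_revision[run["revision"]] += CHUNKS_PER_RUN

print(dict(chunks_by_revision))
```

With no push between the two cron triggers, both runs fall under one key, which matches the 80 results seen on a single revision.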

Assignee

Comment 36

8 years ago

The problem here is that we are running twice/day and there have been no pushes to m-c, so it schedules tests on the most recent revision, which happens to be a duplicate. Would you like to go to once/day? This is only a problem during low-volume periods.

Flags: needinfo?(jmaher)