
Some new users do not have account creation log events
Closed, Declined (Public)

Description

As far as I know, a wiki's logging table should include an entry with log_type = 'newusers' for each newly registered user (since the logging system was created, of course).

However, this is not the case for a small number of accounts. The problem appears to be ongoing as of July 2024.

The problem seems to exist across all wikis. As an example, here is the count of cases from the Chinese Wikipedia:

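-- Count users per registration month that have no 'newusers' log entry
-- performed by their own actor.
SELECT
    LEFT(user_registration, 6) AS registration_month,  -- YYYYMM prefix of the timestamp
    COUNT(*) AS users_with_no_logged_creation
FROM user
LEFT JOIN actor
    ON user_id = actor_user
LEFT JOIN logging
    ON log_actor = actor_id
    AND log_type = 'newusers'
WHERE
    log_timestamp IS NULL  -- anti-join: keep only users with no matching log row
    AND user_registration IS NOT NULL
GROUP BY
    LEFT(user_registration, 6)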

registration_month  users_with_no_logged_creation
200601              1
200603              2
200604              5
200605              65
200606              64
...                 ...
202401              156
202402              99
202403              222
202404              126
202405              251
202406              215
202407              170
(Total)             48749

As another example, here is the count for Commons:

registration_month  users_with_no_logged_creation
200601              3
200602              1
200604              2
200605              15
200606              18
...                 ...
202403              207
202404              143
202405              180
202406              167
202407              123
(Total)             377349

Event Timeline

It's possible this is expected behavior, but I couldn't find any documentation saying so.

Could these be accounts created by email, i.e. log_action = 'byemail'? The same goes for log_action = 'create2'.
In these cases, the actor is the user who created the account, not the account that was created.
Try querying for log_namespace = 2 AND log_title = REPLACE(user_name, ' ', '_') instead.
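
The suggestion above can be combined with the original actor-based check in a single query. What follows is only a sketch, assuming the standard MediaWiki `user`, `actor`, and `logging` tables. It is not the query that was actually run for this task, and the use of `NOT EXISTS` is an illustrative choice rather than something taken from the thread:

```sql
-- Sketch: count users per registration month that have no 'newusers' log
-- entry under either matching strategy.
SELECT
    LEFT(user_registration, 6) AS registration_month,
    COUNT(*) AS users_with_no_logged_creation
FROM user
LEFT JOIN actor
    ON actor_user = user_id
WHERE
    user_registration IS NOT NULL
    -- Strategy 1 (actor ID match): the account itself performed the creation.
    AND NOT EXISTS (
        SELECT 1
        FROM logging
        WHERE log_type = 'newusers'
          AND log_actor = actor_id
    )
    -- Strategy 2 (title match): another user created the account
    -- (e.g. 'byemail' or 'create2'), so the log entry targets the new
    -- account's user page rather than naming it as the actor.
    AND NOT EXISTS (
        SELECT 1
        FROM logging
        WHERE log_type = 'newusers'
          AND log_namespace = 2
          AND log_title = REPLACE(user_name, ' ', '_')
    )
GROUP BY
    LEFT(user_registration, 6)
ORDER BY
    registration_month;
```

A user is counted only when both checks fail, so accounts created by someone else are no longer reported as missing a creation log.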

@matej_suchanek thank you very much! I didn't realize that and that does seem to be a very big part of the puzzle.

When I use both strategies (actor ID match and title match), the number of users with missing creation logs on Chinese Wikipedia is roughly cut in half (from 48,749 to 26,395). Moreover, virtually all of the remaining cases come from the past (many of them seem to be related to SUL), so there is no ongoing pattern of creations without logs:

[Image: users with no logged creations zhwiki.png (438×562 px, 27 KB)]

Given that, I don't think there is anything actually remaining to do here, because I don't think it would be feasible or valuable to try to make the historical account creation logs 100% complete. So I'm going to close this task, but others should feel welcome to reopen it if they think the situation calls for some concrete action or further investigation.