
Open Bug 117790 Opened 23 years ago Updated 2 years ago

apply default charset by domain

Categories

(MailNews Core :: Internationalization, enhancement)


Tracking

(Not tracked)

People

(Reporter: ftang, Unassigned)

Details

I would like a new feature in the mail/news code. We should handle messages
that have no charset label according to the sender's domain name.
For example, if a message has no charset label in its mail header or body
and was sent from xxx.co.jp, then we should use ISO-2022-JP as the charset.
We should do this for .tw, .cn, .hk, and .ko as well.

We should have the following mapping:

.jp=ISO-2022-JP
.tw=BIG5
.cn=GB18030
.ko=EUC-KR
.hk=BIG5
.fr=ISO-8859-1
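
A minimal sketch of how such a lookup might work, assuming the sender's address is taken from the From header. The names `DOMAIN_CHARSETS` and `charset_for_sender` are hypothetical and are not part of the mail/news code. The table mirrors the mapping above, except that it uses .kr (South Korea's actual country-code domain) where the list writes .ko.

```python
from email.utils import parseaddr

# Hypothetical table, copied from the mapping proposed in this report.
# Assumption: "kr" stands in for the ".ko" entry, since .kr is the
# country-code domain actually used for South Korea.
DOMAIN_CHARSETS = {
    "jp": "ISO-2022-JP",
    "tw": "BIG5",
    "cn": "GB18030",
    "kr": "EUC-KR",
    "hk": "BIG5",
    "fr": "ISO-8859-1",
}


def charset_for_sender(from_header):
    """Return the charset mapped to the sender's top-level domain, or None.

    None means "no opinion": the caller keeps whatever default it
    already uses for unlabeled messages.
    """
    _name, address = parseaddr(from_header)
    if "@" not in address:
        return None
    domain = address.rsplit("@", 1)[1].lower().rstrip(".")
    tld = domain.rsplit(".", 1)[-1]
    return DOMAIN_CHARSETS.get(tld)


if __name__ == "__main__":
    print(charset_for_sender("Taro <taro@xxx.co.jp>"))
    print(charset_for_sender("someone@example.com"))
```

Returning `None` for unmapped domains keeps the lookup purely advisory, so it applies only where a domain maps to a single charset.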
There is a feature to set a charset per folder.
I am not sure how this domain name approach will fit with the existing behavior.
Status: NEW → ASSIGNED
Some comments on this idea:

1. This method can capture some of the messages but not all.
   Some top-level domain names don't map to a single encoding.
   E.g. .gr may map to 8859-7 or windows-1253.
2. Newsgroups may map to a specific encoding and may have nothing
   to do with the top domain name. E.g. newsgroup "X" may use
   Big5 by convention no matter where the message comes from.
3. The top country domain names cannot account for ALL messages.
   For example, more and more people use .com, .biz, etc. for 
   different languages. In other words, association of language to
   the top domain name is breaking down. This trend is likely
   to continue.

Thus, using this method alone may not work that well.
However, it may work well if it is used as an option. 

For example, we could keep the current folder encoding approach as the
baseline but offer this method as a "toggle" option for quickly trying
out the next possible encoding.

Ex. The user gets an unreadable message in Greek. The user has set the
     folder encoding to ISO-8859-7. Now the user turns on some quickly
     accessible option (a shortcut key, for example). This automatically
     selects the other Greek encoding, Windows-1253. If this works, the user
     does nothing more. If it does not, we also offer the full
     encoding menu to reset the encoding.

In summary, we might be able to use the proposal, but it alone probably will
not be sufficient. It might be better to default to the
folder encoding instead. As a quick option to get to the next
possible encoding, this method could be very effective.
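
The "next encoding" toggle described in this comment could be sketched roughly as follows. The table `ENCODING_ALTERNATIVES` and the function `next_encoding` are hypothetical names chosen for illustration. Only the Greek pair comes from the example above. Other groups would have to be decided separately.

```python
# Hypothetical groups of encodings that commonly stand in for one another.
# Only the Greek pair is taken from the comment above.
ENCODING_ALTERNATIVES = [
    ["ISO-8859-7", "windows-1253"],
]


def next_encoding(current):
    """Return the next candidate in the current encoding's group.

    Cycles back to the first entry after the last one. Returns None when
    the encoding belongs to no group, in which case the user would fall
    back to the full encoding menu.
    """
    wanted = current.lower()
    for group in ENCODING_ALTERNATIVES:
        lowered = [name.lower() for name in group]
        if wanted in lowered:
            return group[(lowered.index(wanted) + 1) % len(group)]
    return None


if __name__ == "__main__":
    print(next_encoding("ISO-8859-7"))
    print(next_encoding("UTF-8"))
```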
>1. This method can capture some of the messages but not all.
>   Some top-level domain names don't map to a single encoding.
>   E.g. .gr may map to 8859-7 or windows-1253.
Then don't apply the algorithm to it. For example, we can map .tw to BIG5 and
keep using the current default charset when the message comes from .gr.
We need a situational solution that works for some regions and is NOT applied
to the regions where it does not fit.

>2. Newsgroups may map to a specific encoding and may have nothing
>   to do with the top domain name. E.g. newsgroup "X" may use
>   Big5 by convention no matter where the message comes from.
What percentage of newsgroups do this kind of thing? Can you use the "FORCE
charset" setting on the folder in this case?

>3. The top country domain names cannot account for ALL messages.
>   For example, more and more people use .com, .biz, etc. for 
>   different languages. In other words, association of language to
>   the top domain name is breaking down. This trend is likely
>   to continue.
As I said, for those top-level domains you map to NOTHING and let other
information (the current behavior) kick in.

>Thus, using this method alone may not work that well.
Agreed. I never thought it WOULD.

We should think more carefully before we proceed.
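
One possible way to combine the sources discussed in this thread is sketched below. The function `resolve_charset` and the precedence order are assumptions made for illustration, not the existing behavior: a forced folder charset wins, then the message's own label, then the domain mapping, then the folder default. Domains that map to NOTHING simply fall through.

```python
def resolve_charset(label, sender_tld, folder_default, domain_table,
                    folder_forced=None):
    """Pick a charset for a message (hypothetical precedence order).

    label          -- charset declared by the message, or None
    sender_tld     -- sender's top-level domain such as "tw", or None
    folder_default -- the folder's default charset
    domain_table   -- dict of top-level domain to charset; absent keys
                      mean the domain has no single charset
    folder_forced  -- charset forced on the folder, or None
    """
    if folder_forced:
        return folder_forced
    if label:
        return label
    mapped = domain_table.get(sender_tld.lower()) if sender_tld else None
    # Unmapped domains (.gr, .com, ...) fall through to the folder default.
    return mapped or folder_default


if __name__ == "__main__":
    table = {"tw": "BIG5"}
    print(resolve_charset(None, "tw", "ISO-8859-1", table))
    print(resolve_charset(None, "gr", "ISO-8859-7", table))
```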




Target Milestone: --- → mozilla1.2
Target Milestone: mozilla1.2alpha → ---
Product: MailNews → Core
Product: Core → MailNews Core
QA Contact: ji → i18n
This is a feature request.
Assignee: nhottanscp → smontagu
Severity: normal → enhancement
Status: ASSIGNED → NEW
OS: Windows NT → All
Hardware: x86 → All
Assignee: smontagu → nobody
Severity: normal → S3