
US20170177567A1 - Analyzing Web Site for Translation - Google Patents

Publication number
US20170177567A1
Authority
US
United States
Prior art keywords
language
translation
translated
content
web
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US15/447,289
Inventor
Enrique Travieso
Adam Rubenstein
William Fleming
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
MotionPoint Corp
Original Assignee
MotionPoint Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Family has litigation
Application filed by MotionPoint Corp
Priority to US15/447,289
Assigned to MOTIONPOINT CORPORATION. Assignment of assignors interest (see document for details). Assignors: FLEMING, WILLIAM; RUBENSTEIN, ADAM; TRAVIESO, ENRIQUE
Publication of US20170177567A1
Legal status: Abandoned (current)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • G06F40/58 Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
    • G06F17/289
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F17/2247
    • G06F17/2705
    • G06F17/2818
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/12 Use of codes for handling textual entities
    • G06F40/14 Tree-structured documents
    • G06F40/143 Markup, e.g. Standard Generalized Markup Language [SGML] or Document Type Definition [DTD]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • G06F40/42 Data-driven translation
    • G06F40/44 Statistical methods, e.g. probability models
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/02 Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10 TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10S TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00 Data processing: database and file management or data structures
    • Y10S707/99941 Database schema or data structure
    • Y10S707/99942 Manipulating data structure, e.g. compression, compaction, compilation
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10 TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10S TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00 Data processing: database and file management or data structures
    • Y10S707/99941 Database schema or data structure
    • Y10S707/99943 Generating database or data structure, e.g. via user interface
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10 TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10S TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00 Data processing: database and file management or data structures
    • Y10S707/99941 Database schema or data structure
    • Y10S707/99944 Object-oriented database structure
    • Y10S707/99945 Object-oriented database structure processing
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10 TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10S TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00 Data processing: database and file management or data structures
    • Y10S707/99941 Database schema or data structure
    • Y10S707/99948 Application of database or data structure, e.g. distributed, multimedia, or image
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10 TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10S TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00 Data processing: database and file management or data structures
    • Y10S707/99951 File or database maintenance
    • Y10S707/99952 Coherency, e.g. same view to multiple users
    • Y10S707/99953 Recoverability
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10 TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10S TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00 Data processing: database and file management or data structures
    • Y10S707/99951 File or database maintenance
    • Y10S707/99952 Coherency, e.g. same view to multiple users
    • Y10S707/99955 Archiving or backup

Definitions

  • the present invention generally relates to web sites, and more particularly relates to dynamic translation of web site content to another language.
  • Another technique involves managing the translation process by deploying human translators and either maintaining multiple web sites for each language, or re-architecting the existing web site back-end technology to accommodate multiple languages.
  • This requires significant resources in terms of time and cost, including a high level of complexity and duplication of effort.
  • Dynamic and e-commerce sites present additional challenges, as the information to be translated resides in multiple places (e.g., a Structured Query Language database, static Hyper Text Markup Language pages and dynamic Hyper Text Markup Language page templates) and each translated site must interface with the same e-commerce or back-end engine. Further, as the web site changes, ongoing maintenance must also be handled. This approach will yield vastly superior translations that are suitable for professional web sites of large organizations, but at great cost. Most organizations simply do not have, or do not want to invest in, the resources necessary to handle this task internally.
  • FIG. 1 is a block diagram illustrating the system architecture of a conventional web site.
  • the web site of FIG. 1 is presented in a first language, such as English.
  • FIG. 1 shows a web server 112 connected to the Internet 116 via a web connection.
  • a public user 118 such as a person using a computer with a web connection, can access the web server 112 via the Internet 116 and download information, such as a web page 114 , from the web server 112 for viewing.
  • the web server 112 is operated by programming logic 110 , consisting of instructions on how to retrieve, serve, and accept information for processing.
  • the web server 112 further has access to a database 102 of information, as well as Hyper Text Markup Language (HTML) template files 104 , graphics files 106 and multimedia files 108 , all of which constitute the web site served by web server 112 .
  • FIG. 2 is a block diagram illustrating the system architecture of a conventional web site presented in two languages.
  • the web site of FIG. 2 is presented in a first language, such as English (as shown above for FIG. 1 ) and in a second language, such as Spanish.
  • FIG. 2 shows the web server 112 and the other English language components described in FIG. 1 , including the database 102 of information, the HTML template files 104 , graphics files 106 , multimedia files 108 and programming logic 110 .
  • FIG. 2 further shows the public user 118 accessing the web server 112 via the Internet 116 and downloading information, such as a web page 202 in the English or Spanish language.
  • FIG. 2 also shows the Spanish language components 204 of the web site, including the database 208 of information, the HTML template files 214 , graphics files 216 , multimedia files 210 and programming logic 212 .
  • the aforementioned Spanish language components are managed by a multi-lingual content manager 206 , which manages requests for information in the dual languages.
  • FIG. 2 further shows that the web server 112 must be re-engineered to serve multiple sets of content in different languages.
  • the deployment of the Spanish language components 204 of FIG. 2 requires a significant expenditure of time and resources. Further, the deployment requires the re-engineering of the web server 112 , adding to the time and cost associated with the deployment. Additionally, once the Spanish language components 204 have been established, they must be kept synchronized with the English language components, resulting in a recurring cost. This is disadvantageous, as most organizations simply do not have the resources necessary to perform this task.
  • the method on an information processing system includes retrieving a first web content in a first language from a web site, the first web content corresponding to a second web content wherein the second web content is a translation in a second language of the first web content.
  • the method further includes dividing the first web content into a plurality of translatable components and generating a unique identifier for each of the plurality of translatable components of the first web content.
  • the method further includes matching each of the plurality of translatable components of the first web content to a plurality of translated components of the second web content using the unique identifier of each of the plurality of translatable components of the first web content. If a translatable component of the first web content is not matched to a translated component of the second web content, the method further includes designating the translatable component of the first web content for translation into the second language.
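The matching step described above can be sketched as follows. This is a minimal illustration, not the claimed implementation: the regex-based splitter, the SHA-256 identifier, and the names `split_components`, `component_id` and `find_untranslated` are all assumptions introduced for the example.

```python
import hashlib
import re


def split_components(html: str) -> list[str]:
    """Naive stand-in for the parser: keep the non-empty text runs between tags."""
    return [t.strip() for t in re.split(r"<[^>]+>", html) if t.strip()]


def component_id(text: str) -> str:
    """Unique identifier for a component (SHA-256 is an assumption)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def find_untranslated(html: str, translated: dict[str, str]) -> list[str]:
    """Return the components whose identifier has no translated counterpart.

    `translated` maps a component identifier to its stored translation.
    Components returned here would be designated for translation.
    """
    return [c for c in split_components(html) if component_id(c) not in translated]
```

For instance, with a store that only holds a translation for "Welcome", a page containing "Welcome" and "Buy now" would yield "Buy now" as the component still to be translated.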
  • the web server includes a web connection for retrieving a first web content in a first language from a web site, the first web content corresponding to a second web content wherein the second web content is a translation in a second language of the first web content.
  • the web server further includes a processor for dividing the first web content into a plurality of translatable components and generating a unique identifier for each of the plurality of translatable components of the first web content.
  • the processor further matches each of the plurality of translatable components of the first web content to a plurality of translated components of the second web content using the unique identifier of each of the plurality of translatable components of the first web content. If a translatable component of the first web content is not matched to a translated component of the second web content, the processor designates the translatable component of the first web content for translation into the second language.
  • a computer program product including computer instructions for synchronizing web content.
  • the computer instructions include instructions on an information processing system for retrieving a first web content in a first language from a web site, the first web content corresponding to a second web content wherein the second web content is a translation in a second language of the first web content.
  • the computer instructions further include instructions for dividing the first web content into a plurality of translatable components and generating a unique identifier for each of the plurality of translatable components of the first web content.
  • the computer instructions further include instructions for matching each of the plurality of translatable components of the first web content to a plurality of translated components of the second web content using the unique identifier of each of the plurality of translatable components of the first web content. If a translatable component of the first web content is not matched to a translated component of the second web content, the computer instructions further include instructions for designating the translatable component of the first web content for translation into the second language.
  • the present invention allows for the deployment of a corresponding web site in another language with a reduced amount of configuring of the original web site. This reduces the amount of Information Technology (IT) resources that must be consumed by the providers of the original web site and reduces the amount of time necessary for deployment. Also as discussed below, only a single link is required to be deployed on the original web site in order to provide access to the corresponding web site in another language. This is beneficial as it reduces the amount of time and effort that must be expended by the providers of the original web site in order to release the corresponding web site in another language.
  • the present invention is further advantageous because it allows for the use of human translation, thereby producing a high quality translation of the original web site in another language. This is beneficial as it reduces or avoids the use of machine translation, which can be of low quality. Additionally, the present invention preserves the formatting of the original web site, including when a translation is of a larger size or length than the original text. This is beneficial as it allows for the preservation of the look and feel of the original web site, thereby allowing users to maintain familiarity with the corresponding web site in another language.
  • the present invention is further advantageous because it supports large, complex and rapidly-changing web sites.
  • the present invention supports web sites with any number of web pages, links, downloads and other materials, thereby allowing for greater flexibility and usability of the present invention.
  • the present invention also supports web sites that change continuously or periodically, as it regularly polls the web site to discern changes and initiate corresponding translations. This is beneficial as it reduces the amount of time and effort that is expended on the maintenance of a corresponding web site in another language.
  • the present invention is further advantageous because it provides a corresponding web site in a second language, thereby meeting the needs of customers speaking the second language. This is beneficial as it generates traffic consisting of customers speaking the second language and provides customers speaking the second language a self-service e-commerce option. This is also beneficial because it provides more accessible shopping opportunities for customers in the second language and provides a more user-friendly environment for these clients in the second language.
  • FIG. 1 is a block diagram illustrating the system architecture of a conventional web site.
  • FIG. 2 is a block diagram illustrating the system architecture of a conventional web site presented in two languages.
  • FIG. 3 is a block diagram illustrating the system architecture of a web site presented in two languages, in one embodiment of the present invention.
  • FIG. 4 is a block diagram illustrating the system architecture of the present invention, in one embodiment of the present invention.
  • FIG. 5 is an operational flow diagram depicting the process of the translation server, according to a preferred embodiment of the present invention.
  • FIGS. 6A-6C illustrate an operational flow diagram depicting the serving process of the translation server, according to a preferred embodiment of the present invention.
  • FIG. 7 is a block diagram depicting the serving process in an ASP model of the translation server, according to a preferred embodiment of the present invention.
  • FIG. 8 is a block diagram depicting the serving process in a web service model of the translation server, according to a preferred embodiment of the present invention.
  • FIG. 9 is a screenshot of a WebCATT interface used for viewing a translatable component, in one embodiment of the present invention.
  • FIG. 10 is a screenshot of a WebCATT interface used for viewing a translatable component along with a corresponding translation, in one embodiment of the present invention.
  • FIG. 11 is a screenshot of a WebCATT interface used for editing a translatable component, in one embodiment of the present invention.
  • FIG. 12 is a screenshot of a WebCATT interface used for viewing a translation queue, in one embodiment of the present invention.
  • FIG. 13 is an operational flow diagram depicting the process of WebCATT, according to a preferred embodiment of the present invention.
  • FIG. 14 is an operational flow diagram depicting the process of the spider, according to a preferred embodiment of the present invention.
  • FIG. 15 is an operational flow diagram depicting the synchronization process according to a preferred embodiment of the present invention.
  • FIG. 16 is a block diagram showing a computer system useful for implementing the present invention.
  • the present invention overcomes problems with the prior art by providing an efficient and easy-to-implement system and method for dynamic language translation of a web site.
  • FIG. 3 is a block diagram illustrating the system architecture of a web site presented in two languages, in one embodiment of the present invention.
  • the web site of FIG. 3 is presented in a first language, such as English, and a second language, such as Spanish.
  • FIG. 3 shows the web server 112 of FIG. 1 connected to the Internet 116 via a web connection.
  • a public user 118 accesses the web server 112 via the Internet 116 and downloads information, such as a web page, from the web server 112 for viewing.
  • the user 118 utilizes a client application, such as a web browser, on his client computer to connect to the web site via the network 116 .
  • the user 118 browses through the products or services offered by the web site by navigating through its web pages.
  • the web server 112 is operated by programming logic 110 and the web server 112 further has access to a database 102 of information, as well as HTML template files 104 , graphics files 106 and multimedia files 108 , all of which constitute the English components of the web site served by web server 112 .
  • FIG. 3 further shows translation server 300 situated apart from and existing independently from the web server 112 .
  • the translation server 300 embodies the main functions of the present invention, including the provision of a web site in a secondary language, such as Spanish.
  • the translation server 300 provides the secondary language components of a base web site, which is provided by web server 112 , without requiring integration with the base web site or re-configuring or re-engineering of the web server 112 .
  • the deployment of the secondary language components of FIG. 3 requires significantly less time and fewer resources than the deployment of FIG. 2 . Further, the deployment of FIG. 3 does not require the re-engineering of the web server 112 . Additionally, once the secondary language components have been established by the translation server 300 , they are automatically kept synchronized with the English language components of the base web site. Thus, the system of the present invention is advantageous as it reduces the amount of time, effort and resources that are required to deploy a secondary language web site.
  • FIG. 4 is a block diagram illustrating the system architecture of the present invention, in one embodiment of the present invention.
  • FIG. 4 presents an alternative point of view of the system architecture of the present invention.
  • FIG. 4 shows a web site 414 representing a web site in a first language such as English that is connected to the Internet 412 via a web connection.
  • FIG. 4 further shows a user 416 that utilizes a web connection to the Internet 412 to browse and navigate the web pages served by the web site 414 .
  • FIG. 4 further shows the translation server 400 , corresponding to the translation server 300 of FIG. 3 , and a translation database 406 used by the translation server 400 to store translatable components during the serving of web pages in a secondary language such as Spanish.
  • also shown in FIG. 4 is the Web Computer Aided Translation Tool (WebCATT), which is a tool for aiding a human 418 or an admin 410 in translating the components of a web site in a first language.
  • FIG. 4 also shows a spider 404 for use in analyzing and sizing a web site 414 .
  • the translation server 400 and WebCATT tool 408 are connected to a web server 402 , which is the conduit through which all web actions of the above tools are channeled.
  • the translation server 400 and WebCATT tool 408 are described in greater detail below.
  • the computer systems of translation server 400 , WebCATT tool 408 , spider 404 and web server 402 are one or more Personal Computers (PCs) (e.g., IBM or compatible PC workstations running the Microsoft Windows 95/98/2000/ME/CE/NT/XP operating system, Unix, Linux, Macintosh computers running the Mac OS operating system, or equivalent), Personal Digital Assistants (PDAs), game consoles or any other information processing devices.
  • the computer systems of translation server 400 , WebCATT tool 408 , spider 404 and web server 402 are server systems (e.g., SUN Ultra workstations running the SunOS operating system or IBM RS/6000 workstations and servers running the AIX operating system).
  • Internet network 412 is a circuit switched network, such as the Public Switched Telephone Network (PSTN).
  • the network 412 is a packet switched network.
  • the packet switched network is a wide area network (WAN), such as the global Internet, a private WAN, a local area network (LAN), a telecommunications network or any combination of the above-mentioned networks.
  • network 412 is a wired network, a wireless network, a broadcast network or a point-to-point network.
  • the translation server 400 is the back-end application responsible for the conversion of web pages to another language.
  • the translation server 400 parses each incoming HTML page into translatable components, substitutes each incoming translatable component with an appropriate translated component, and returns the translated web page back to the online user 416 .
  • Page conversion is performed on the fly each time an online user 416 requests a page in the second or alternate language.
  • the translation server 400 will translate the page if enough translated content is available to meet a customer-specified translation threshold. If this is not the case, then the page will be returned in the first or original language.
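A minimal sketch of the threshold check described above. The text does not say how "enough translated content" is measured; counting translated segments against total segments, the function name `should_translate`, and the handling of an empty page are all assumptions made for illustration.

```python
def should_translate(total_segments: int, translated_segments: int,
                     threshold: float) -> bool:
    """Decide whether to serve the translated page or fall back to the original.

    `threshold` is the customer-specified fraction (0.0 to 1.0) of segments
    that must already have a translation. Measuring coverage by segment
    count is an assumption; the source does not define the metric.
    """
    if total_segments == 0:
        # Assumption: a page with nothing to translate is served as-is.
        return True
    return translated_segments / total_segments >= threshold
```

With a threshold of 0.75, a page with 8 of 10 segments translated would be served in the second language, while a page with 7 of 10 would be returned in the original language.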
  • a translatable component includes any one of a text segment, an image file with text to be translated, a multimedia file with text or audio to be translated, a file with text to be translated, a file with image with text to be translated, a file with audio to be translated and a file with video with at least one of text and audio to be translated.
  • the page conversion process follows seven major steps. In a first step, for each text segment encountered, the translation server 400 replaces the segment with its translated text segment if a translation is available. If no translation is available, either the text remains in the original language or a machine translation is performed on the fly, depending on the customer's preference. In a second step, for each linked file (images, PDF files, Flash movies, etc.) encountered, if a translated file is available the HTML link tag is rewritten so that it points to the translated file. If a translated file is not available, the original link tag is left untouched. In a third step, any relative Uniform Resource Locator (URL) found in the page is converted to an absolute URL. This is necessary because the browser resolves relative URLs based on the URL of the current page. In the case of a translated page, the URL of the page is actually in the translation server 400 . As a result, the browser would otherwise request all files and links with relative URLs from the translation server 400 , which is not the correct original location.
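The relative-to-absolute conversion of the third step can be illustrated with Python's standard `urllib.parse.urljoin`, which resolves a link against the URL of the page it came from. The regex-based attribute scan and the name `absolutize_links` are assumptions for this sketch; a production parser would handle unquoted and single-quoted attributes as well.

```python
import re
from urllib.parse import urljoin

# Matches double-quoted href/src attributes only (a simplifying assumption).
ATTR = re.compile(r'(?P<attr>\b(?:href|src))="(?P<url>[^"]*)"', re.IGNORECASE)


def absolutize_links(html: str, page_url: str) -> str:
    """Rewrite every href/src so it resolves against the original page URL."""
    def fix(match: re.Match) -> str:
        absolute = urljoin(page_url, match.group("url"))
        return f'{match.group("attr")}="{absolute}"'
    return ATTR.sub(fix, html)
```

URLs that are already absolute pass through `urljoin` unchanged, so the rewrite can safely be applied to every matching attribute in the page.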
  • each JavaScript block is parsed for directive tags that indicate text content to translate. Images are automatically detected by recognition of the file extension. Script tags that reference external JavaScript files are rewritten so that they are redirected to the translation server 400 . They are then parsed and translated in a separate browser Hyper Text Transfer Protocol (HTTP) request.
  • each link to another web page is rewritten so that the original URL is redirected to the translation server 400 . When an online user clicks on a rewritten link, the request then goes directly to the translation server 400 and the page is in turn translated. Links to other web pages placed in JavaScript blocks are automatically recognized, either by extension or by pre-defined customer specific URL patterns, and also rewritten for redirection. This feature, which keeps the user in the alternate language as they browse the site, is called “implicit navigation”.
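The link rewriting behind "implicit navigation" might look like the sketch below. The endpoint `TRANSLATION_SERVER`, the URL layout (language code in the path, original URL in a `url` query parameter) and the function name `rewrite_links` are hypothetical; the source only states that the original URL is redirected to the translation server.

```python
import re
from urllib.parse import quote, urljoin

# Hypothetical endpoint; the real redirect format is not given in the text.
TRANSLATION_SERVER = "http://translate.example.net"

# Anchor tags with a double-quoted href (a simplifying assumption).
ANCHOR_HREF = re.compile(r'(<a\b[^>]*?\bhref=")([^"]*)(")', re.IGNORECASE)


def rewrite_links(html: str, page_url: str, lang: str = "es") -> str:
    """Point each page link at the translation server, carrying the original URL."""
    def redirect(match: re.Match) -> str:
        original = urljoin(page_url, match.group(2))   # resolve relative links first
        encoded = quote(original, safe="")             # percent-encode for the query string
        return f"{match.group(1)}{TRANSLATION_SERVER}/{lang}?url={encoded}{match.group(3)}"
    return ANCHOR_HREF.sub(redirect, html)
```

Because every link in the served page already points back at the translation server, each click yields another translated page without any change to the original site.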
  • the translation server 400 automatically schedules the web page for translation by placing it in the WebCATT 408 translation queue, in the event a translation cannot be found for one or more text segments or linked files in the page.
  • FIG. 5 is an operational flow diagram depicting the process of the translation server 400 , according to a preferred embodiment of the present invention.
  • the operational flow diagram of FIG. 5 depicts the process of the translation server 400 of responding to a user request for a web page in a secondary language.
  • the operational flow diagram of FIG. 5 begins with step 502 and flows directly to step 504 .
  • the translation server 400 receives a request from a user 416 on a web site 414 , the web site 414 having a first web content in a first language such as English.
  • the request, such as an HTTP request or a Simple Mail Transfer Protocol (SMTP) request, calls for a second web content in a second language such as Spanish.
  • the second web content is a human or machine translation in a second language of the first web content.
  • the first language includes any one of English, French, Spanish, German, Portuguese, Italian, Japanese, Chinese, Korean, and Arabic and the second language is different than the first language and includes any one of English, French, Spanish, German, Portuguese, Italian, Japanese, Chinese, Korean, and Arabic.
  • the translation server 400 retrieves the first web content from the web site 414 .
  • the translation server 400 divides the first web content into a plurality of translatable components.
  • the translation server 400 generates a unique identifier for each of the plurality of translatable components of the first web content. For a text segment, the translation server 400 can generate a unique identifier using a hash code, a checksum or a mathematical algorithm.
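As an illustration of the two options named above, the sketch below derives an identifier for a text segment with a hash code (SHA-256 via `hashlib`) and with a checksum (CRC-32 via `zlib`). The choice of these particular algorithms and the function names are assumptions; the text does not specify which are used.

```python
import hashlib
import zlib


def hash_id(segment: str) -> str:
    """Hash-code identifier: 64 hex characters, collisions are very unlikely."""
    return hashlib.sha256(segment.encode("utf-8")).hexdigest()


def checksum_id(segment: str) -> int:
    """Checksum identifier: a 32-bit integer, cheaper but more collision-prone."""
    return zlib.crc32(segment.encode("utf-8"))
```

A shorter identifier such as a 32-bit checksum makes it more likely that two different segments share an identifier, which is one reason a lookup may need a follow-up comparison of the actual text.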
  • the translation server 400 identifies a plurality of translated components of the second web content using the unique identifier of each of the plurality of translatable components of the first web content.
  • the translation server 400 arranges or puts the plurality of translated components of the second web content to preserve a format that corresponds to the first web content.
  • the translation server 400 can arrange or put the plurality of translated components of the second web content to preserve a format that corresponds to the first web content, including putting formatting tags that are not visible in the first web content.
  • the translation server 400 provides the second web content in response to the request that was received.
  • the control flow of FIG. 5 stops.
  • FIGS. 6A-6C illustrate an operational flow diagram depicting the serving process of the translation server 400 , according to a preferred embodiment of the present invention.
  • the operational flow diagram of FIGS. 6A-6C depicts the process of the translation server 400 of providing a web page in a secondary language in response to a user request.
  • the operational flow diagram of FIGS. 6A-6C provides more detail with regard to steps 508 - 514 of FIG. 5 above.
  • the operational flow diagram of FIGS. 6A-6C begins with step 601 and flows directly to step 602 .
  • Step 601 begins with a source HTML page or first web content of step 506 of FIG. 5 .
  • step 602 at least one portion of the first web content is parsed into a translatable component.
  • step 603 it is determined whether the end of the file of the first web content is reached. If the result of the determination is affirmative, then control flows to step 612 . Otherwise, control flows to step 604 .
  • step 604 it is determined whether the translatable component that was parsed in step 602 is a text segment. If the result of the determination is affirmative, then control flows to step 605 . Otherwise, control flows to step 614 .
  • step 605 a hash code or other unique identifier is computed for the text segment.
  • step 606 using the unique identifier, a matching translated text segment is looked up in a cache.
  • step 607 it is determined whether the matching translated text segment is found in the cache. If the result of the determination is affirmative, then control flows to step 608 . Otherwise, control flows to step 618 .
  • step 608 it is determined whether there were multiple matching translated text segments found in the cache. If the result of the determination is affirmative, then control flows to step 620 . Otherwise, control flows to step 609 .
  • step 620 the correct translated segment is determined using the sequence constraints and a character by character comparison.
  • step 609 it is determined whether translation of the text segment is suppressed or not yet translated. If the result of the determination is affirmative, then control flows to step 621 . Otherwise, control flows to step 610 .
  • step 610 the matching translated text segment is set as a target segment.
  • step 621 the current text segment is set as the target segment.
  • step 640 the target segment is added to the output web content, or second web content (i.e., the translated HTML page or the output HTML page).
  • step 623 the second web content is output for provision to the user requesting the web page.
  • step 612 it is determined whether there is an incomplete translation of the current web page, i.e., the first web content. If the result of the determination is affirmative, then control flows to step 613 . Otherwise, control flows to step 611 .
  • step 613 the current web page is scheduled for translation.
  • step 611 the translation activity performed by the translation server 400 in servicing the current web page is recorded in the translation database 406 .
  • step 625 it is determined whether the percentage of the current web page, i.e., the first web content, that is translated is above a threshold. If the result of the determination is affirmative, then control flows to step 624 . Otherwise, control flows to step 626 .
  • step 624 the second web content or translated HTML page is output for provision to the user requesting the web page.
  • step 626 the current web page or first web content is output unchanged for provision to the user requesting the web page.
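The threshold decision of steps 625, 624 and 626 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name, the way the translated percentage is counted (translated segments over total segments) and the default threshold of 0.8 are all assumptions, since the text does not specify them.

```python
def choose_output(original_html: str, translated_html: str,
                  translated: int, total: int,
                  threshold: float = 0.8) -> str:
    """Return the page to serve, following steps 625, 624 and 626.

    `translated` and `total` are hypothetical segment counts; the source
    only says that a translated percentage is compared with a threshold.
    """
    # A page with nothing to translate is treated as fully translated.
    fraction = translated / total if total else 1.0
    # Step 624: serve the translated page when the fraction is above the threshold.
    # Step 626: otherwise serve the original page unchanged.
    return translated_html if fraction > threshold else original_html
```

With these assumptions, a page with 9 of 10 segments translated is served in the alternate language, while a page with 5 of 10 is served unchanged.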
  • step 614 it is determined whether the translatable component parsed in step 602 is a translatable file such as a PDF file, an image file, etc. If the result of the determination is affirmative, then control flows to step 615 . Otherwise, control flows to step 629 . In step 629 , it is determined whether the translatable component parsed in step 602 is a link to another translatable page. If the result of the determination is affirmative, then control flows to step 628 . Otherwise, control flows to step 627 . In step 627 , a tag is added to the translated HTML page to indicate a link (this is described in greater detail below). In step 628 , the link is modified to redirect the URL (this is described in greater detail below).
  • step 615 a translated file corresponding to the translatable file is looked up in a cache.
  • step 616 it is determined whether the translated file was found. If the result of the determination is affirmative, then control flows to step 617 . Otherwise, control flows to step 633 .
  • step 633 the translated file is looked up in the translation database 406 .
  • step 635 it is determined whether the translated file was found. If the result of the determination is affirmative, then control flows to step 634 . Otherwise, control flows to step 632 .
  • step 634 the translated file that was found is stored in the cache.
  • step 632 an incomplete translation is recorded in the translation database 406 .
  • step 630 the original web page is set as the target file.
  • step 631 the target file is added to the translated HTML page.
  • step 617 it is determined whether translation is suppressed for the translatable file. If the result of the determination is affirmative, then control flows to step 630 . Otherwise, control flows to step 636 .
  • step 636 the translated file is set as the target file.
  • step 618 using the unique identifier, a matching translated text segment is looked up in the translation database 406 .
  • step 622 it is determined whether the matching translated text segment is found in the database. If the result of the determination is affirmative, then control flows to step 619 . Otherwise, control flows to step 637 .
  • step 619 the translated segment that was found is stored in the cache. In step 637 , an incomplete translation is recorded in the translation database 406 .
  • step 638 it is determined whether a machine translation of the text segment can be performed. If the result of the determination is affirmative, then control flows to step 639 . Otherwise, control flows to step 621 . In step 639 , the machine translation is set as the target segment.
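The text-segment branch of FIGS. 6A-6C (cache lookup, database lookup, machine translation, then pass-through) can be sketched as below. This is a simplified illustration under stated assumptions: the cache and the translation database 406 are modelled as plain dictionaries keyed by the unique identifier, the function and parameter names are hypothetical, and the multiple-match step 620 and the suppression check of step 609 are omitted.

```python
from typing import Callable, Optional


def resolve_segment(
    segment: str,
    key: int,
    cache: dict[int, str],
    database: dict[int, str],
    machine_translate: Optional[Callable[[str], str]] = None,
    incomplete: Optional[list[str]] = None,
) -> str:
    """Return the target segment for one text segment (simplified)."""
    # Steps 606-607: look for a matching translation in the cache.
    if key in cache:
        return cache[key]
    # Steps 618, 622 and 619: fall back to the database and store the hit in the cache.
    if key in database:
        cache[key] = database[key]
        return cache[key]
    # Step 637: record the incomplete translation.
    if incomplete is not None:
        incomplete.append(segment)
    # Steps 638-639: use a machine translation when one can be performed.
    if machine_translate is not None:
        return machine_translate(segment)
    # Step 621: otherwise the current text segment becomes the target segment.
    return segment
```

A second request for the same segment is then answered from the cache without touching the database.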
  • the translation server 400 can be presented in a variety of models.
  • the translation server 400 converts full web pages or script files at a time and delivers them directly to the online user 416 .
  • the links in a web page are rewritten so that the request is redirected to the translation server 400 .
  • the URL of the translation server 400 for a fictional customer called ABC Widgets is defined as: http://trans1.motionpoint.net/abcwidgets/enes/
  • the translation server 400 in turn reads the original URL passed in the query string (i.e., everything after the question mark), requests the page from the ABC Widgets server, converts it to the alternate language, and sends it back to the user 416 .
  • FIG. 7 is a block diagram depicting the serving process in an ASP model of the translation server 400 , according to a preferred embodiment of the present invention.
  • the user 416 clicks on a link of a web page in a first language on the web site 414 .
  • the link points to a page to be translated.
  • the translation server 400 receives the request and processes it.
  • the translation server 400 forwards the request to the web site 414 and in a third step 706 , the web site 414 provides the page to the translation server 400 for translation.
  • the translation server 400 translates the page using the translations in the translation database 406 and sends the translated page to the user 416 .
  • the translated content is not delivered directly to the online user 416 . Instead the customer's web site server 414 issues the request for translation to the translation server 400 , which acts as a web translation service.
  • the translation server 400 can convert full pages or just specific text segments and/or files. When directly translating text segments or files, multiple translation requests can be issued, one per segment or file, or multiple segments and files can be translated in a single batched request.
  • FIG. 8 is a block diagram depicting the serving process in a web service model of the translation server 400 , according to a preferred embodiment of the present invention.
  • a first step 802 the user 416 clicks on a link of a web page in a first language on the web site 414 .
  • the link points to a page to be translated.
  • the web site server 414 receives the request and processes it.
  • the web site 414 provides the page to the translation server 400 for translation.
  • the translation server 400 provides the translated page to the web site 414 .
  • the web site 414 sends the translated page to the user 416 .
  • the hosting and management model defines who deploys and manages the hardware and operating system software in which the software components of the present invention reside.
  • the hosted and managed model is a fully outsourced model in which one entity hosts the service and all translated data. Under this model, one entity deploys the translation server 400 and WebCATT 408 software on its own hardware. All hardware and software is provisioned and maintained by this entity, so the customer web site 414 has no responsibility for any hardware or software related to the service.
  • the hosting entity is responsible for: 1) provisioning, installing, configuring and maintaining all hardware, including communication to the Internet 412 , 2) installing, configuring and maintaining all operating system, web server and database server software, 3) installing, configuring and managing on an ongoing basis the translation server 400 and WebCATT 408 software and 4) maintaining staff and subcontractors that use the WebCATT 408 software to perform the translations that maintain the alternate language site in sync with the original language site.
  • the translation server 400 and WebCATT 408 software are installed on the customer web site's hardware.
  • the customer web site 414 is responsible for: 1) provisioning, installing, configuring and maintaining all hardware, including communication to the Internet 412 , 2) installing, configuring and maintaining all operating system, web server and database server software.
  • the managing entity is responsible for: 1) installing, configuring and managing on an ongoing basis the translation server 400 and WebCATT 408 software, 2) maintaining staff and subcontractors that use the WebCATT 408 software to perform the translations that maintain the alternate language site in sync with the original language site.
  • the components of the present invention can be deployed in dedicated or shared server environments.
  • multiple customer web sites share the same hardware.
  • multiple translation servers 400 are installed in the same web server 402 , which connects to a database server containing the database 406 of translated data.
  • a single WebCATT 408 software installation may also be shared by multiple customers. This setup is cost efficient and can be used for small and medium size sites with low-to-moderate web site traffic.
  • the system of the present invention does not save or maintain translated pages. Although this may be useful for sites with static content, it becomes unmanageable for sites whose content is generated dynamically from database information in response to a user's request. Instead, the present invention stores only those components within a web page that require translation, i.e., translatable components.
  • Parsing is the process of breaking-up an HTML page submitted for translation into its translatable and non-translatable components.
  • Non-translatable components simply pass through the system unchanged (except for URLs that need rewriting).
  • Translatable components are processed and replaced by their translated counterparts if available.
  • a translatable component includes any one of a text segment, an image file with text to be translated, a multimedia file with text or audio to be translated, a file with text to be translated, a file with image with text to be translated, a file with audio to be translated and a file with video with at least one of text and audio to be translated.
  • a text segment is a chunk of text on the page as defined by the HTML that surrounds it.
  • a text segment can range from a single word to a paragraph or multiple paragraphs.
  • a file is any type of external content that resides on a file, is linked from within the page, and may require translation. Typical types of linked files found in web pages are images, PDF files, MS Word documents and Flash movies.
  • the above example page would by default be parsed into the following six text segments: 1) ‘Widget Product Information’, 2) ‘Widget’, 3) ‘Model#123’, 4) ‘This widget is very useful for many chores around the house.’, 5) ‘Product photo’, 6) ‘Click here to return to the home page’.
  • the above example page would further be parsed into the following one file: img/widgetpicture.gif.
  • the parsing system breaks-up text segments according to the HTML tags in the page.
  • the sentence ‘Widget Model#123’ was broken up into two segments because there was an HTML bold tag (<b>) in the middle of it.
  • the parsing system is flexible and allows defining, on per-customer basis, which HTML tags are formatting tags that should not break up text segments.
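The tag-driven segmentation described above, including the per-customer list of formatting tags that should not break up text segments, can be sketched with Python's standard `html.parser` module. This is an illustrative sketch only: the class and function names are hypothetical, the sample markup in the usage note is an assumption, and only text nodes and `alt` attributes are handled.

```python
from html.parser import HTMLParser


class SegmentParser(HTMLParser):
    """Collect text segments, splitting on every tag except formatting tags."""

    def __init__(self, formatting_tags=()):
        super().__init__()
        self.formatting_tags = set(formatting_tags)
        self.segments = []
        self._buffer = []

    def _flush(self):
        # Collapse whitespace and emit the buffered text as one segment.
        text = " ".join("".join(self._buffer).split())
        if text:
            self.segments.append(text)
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag not in self.formatting_tags:
            self._flush()
        # Text inside an 'alt' attribute is treated as its own segment.
        for name, value in attrs:
            if name == "alt" and value:
                self.segments.append(value)

    def handle_endtag(self, tag):
        if tag not in self.formatting_tags:
            self._flush()

    def handle_data(self, data):
        self._buffer.append(data)

    def close(self):
        super().close()
        self._flush()


def parse_segments(html, formatting_tags=()):
    parser = SegmentParser(formatting_tags)
    parser.feed(html)
    parser.close()
    return parser.segments
```

Under these assumptions, `parse_segments("<p>Widget <b>Model#123</b></p>")` yields two segments, while passing `formatting_tags={"b"}` keeps ‘Widget Model#123’ together as one segment.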
  • the translation server 400 performed several changes to the page. Each text segment was replaced with a corresponding translation. It is important to note that the text of the image description (‘Product photo’) placed in the ‘alt’ attribute of the image tag was recognized as a text segment and translated.
  • the translation server 400 can recognize text segments inside attributes of HTML tags, such as the text in buttons of a form.
  • the URL of the image tag was replaced to point to a translated image file.
  • the translation server 400 only executes this action if a translated file has been defined (since many images do not have text and thus do not require translation), otherwise it does not change the URL of the image (except to make the URL absolute if it is not).
  • the ‘ES_24.gif’ image file was defined in WebCATT 408 as the translation for the ‘widgetpicture.gif’ file.
  • the URL of the home page link was rewritten from ‘http://www.abcwidgets.com’ to ‘http://trans1.motionpoint.net/abcwidgets/enes/?24;http://www.abcwidgets.com’ in order to redirect it to the translation server 400 . This is done so when the online user clicks on the ‘Click here to return to the home page’ link, the request will go directly to the translation server 400 and the home page will also be translated. This process is called implicit navigation and it is explained in more detail below.
  • Implicit navigation is a translation server 400 feature that keeps an online user 416 in the alternate language as he/she browses a web site. Implicit navigation is implemented by rewriting the URLs in the applicable links inside a page as the page is being translated, so they are redirected to the translation server 400 . As a result, not only is the page translated, but also all applicable links to other translated pages within the page are modified so that when the consumer clicks on the linked page it will also be automatically translated.
  • the translation server 400 prefixes the original URL with the URL of the translation server 400 , so the original URL becomes the query string to the translation server 400 URL.
  • the request goes to the translation server 400 , which reads the query string to obtain the original URL to be translated and requests the page to be translated from this URL.
  • the translation server 400 then converts the page received to the alternate language and delivers the translated page to the consumer directly.
  • the original URL is only one part of the query string.
  • the other part of the query string is a special numeric action ID, which provides information about the type of conversion request being performed.
  • “1” indicates no action.
  • “2” indicates pages that were not translated, or for which the translation did not meet the minimum translation percentage, and therefore should not be returned.
  • “4” indicates HTML to be translated is being submitted as POST data when processing a POST request. If this action is not specified, then the URL passed in the query string is accessed in order to obtain the HTML to be translated.
  • “8” indicates that all relative URLs in the HTML should be converted to absolute URLs. This is necessary only in GET requests. If relative URLs are not used in the document, this action should not be specified.
  • “16” indicates implicit navigation is enabled.
  • “32” indicates the request includes cookie data to be passed back as cookies to all URLs to be translated.
  • “64” indicates that all links in the page are to be disabled. This overrides action ID “16” if also specified.
  • “128” indicates translation of the page is to be disabled. This is used to process tags without affecting content.
  • “8192” indicates a translation is being requested from WebCATT 408 for previewing.
  • the translation server 400 adds special HTML tags to the web page to allow highlighting translated as opposed to not-translated segments, disabling links to other pages, adding alternate language hover preview features, and allowing editing a segment or file by clicking on it in the preview page.
  • the translation server 400 URL is: http://trans1.motionpoint.net/abcwidgets/enes/
  • action ID is “24”, which means to enable implicit navigation and to convert relative URLs to absolute.
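Because the action IDs listed above are powers of two, a combined value such as “24” can be read as a bit mask (16 + 8). The sketch below decodes such a query string. The member names are hypothetical labels for the documented values, and the `action;url` layout with a semicolon separator is an assumption taken from the rewritten home page URL shown earlier.

```python
from enum import IntFlag


class Action(IntFlag):
    # Values come from the text; the names are illustrative only.
    NO_ACTION = 1
    SUPPRESS_UNTRANSLATED = 2
    HTML_IN_POST_DATA = 4
    ABSOLUTE_URLS = 8
    IMPLICIT_NAVIGATION = 16
    FORWARD_COOKIES = 32
    DISABLE_LINKS = 64
    DISABLE_TRANSLATION = 128
    WEBCATT_PREVIEW = 8192


def parse_query(query: str) -> tuple[Action, str]:
    """Split '<action id>;<original URL>' into its two parts."""
    action_id, _, original_url = query.partition(";")
    return Action(int(action_id)), original_url.strip()


actions, url = parse_query("24;http://www.abcwidgets.com")
# Action ID 24 enables implicit navigation (16) and absolute URLs (8).
print(Action.IMPLICIT_NAVIGATION in actions, Action.ABSOLUTE_URLS in actions, url)
```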
  • implicit navigation can be pre-defined by domain and/or URL patterns.
  • In a typical scenario, only pages being served from a specific domain(s) should be translated.
  • If the implicit navigation domains are defined as abcwidgets.com and abcwidgets.net, then only URLs within those two domains will be rewritten.
  • URL patterns can be used. For example, if ABC Widgets wishes not to translate the careers and investor relations sections of their site, then the following two example Exclude URL patterns could be used: 1) abcwidgets.com/careers/ and 2) abcwidgets.com/investor/
  • Implicit navigation can also be controlled from within the HTML to be translated through the use of directive tags or directive attributes. These are explained in detail below.
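The domain and Exclude URL pattern rules for implicit navigation can be sketched as follows. This is a minimal illustration: the function names are hypothetical, subdomain matching and plain substring matching of exclude patterns are assumptions (the text does not define the pattern syntax), and the rewritten URL layout follows the ABC Widgets example.

```python
from urllib.parse import urlsplit

# Translation server URL from the ABC Widgets example.
TRANSLATION_SERVER = "http://trans1.motionpoint.net/abcwidgets/enes/"


def should_rewrite(url, domains, exclude_patterns):
    """Decide whether a link qualifies for implicit navigation."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    # Assumption: a configured domain also covers its subdomains (e.g. www).
    in_domain = any(host == d or host.endswith("." + d) for d in domains)
    # Assumption: an exclude pattern matches as a substring of host + path.
    target = host + parts.path
    excluded = any(pattern in target for pattern in exclude_patterns)
    return in_domain and not excluded


def rewrite_link(url, action_id, domains, exclude_patterns):
    """Prefix qualifying URLs with the translation server URL and action ID."""
    if should_rewrite(url, domains, exclude_patterns):
        return f"{TRANSLATION_SERVER}?{action_id};{url}"
    return url
```

Links outside the configured domains, or inside an excluded section such as the careers pages, are returned unchanged.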
  • the system of the present invention enables users to access the same original language e-commerce database in multiple languages. Since the translation server 400 processes web pages after they have left the customer web site 414 , but before they reach the user 416 , it does not affect a web server's e-commerce technology. As a result, the same web site 414 can be accessed in multiple languages, and all users are accessing the same e-commerce database simultaneously.
  • an auction web site can allow users in different countries to bid on the same item. Each user can view the site and bid on the item in his native language. Since all bids from the different countries are actually hitting the same web site and the same e-commerce engine, all bids occur in real time and each user can see in real-time what all the other users in all other countries are bidding.
  • the translation server 400 computes three 64-bit numeric hash codes from each incoming text segment.
  • the hash code function is optimized to spread the resulting hash code across the full range of 64-bit numeric values (−9223372036854775808 to 9223372036854775807).
  • the three hash codes are computed as follows: 1) hash code 1 is based on all characters in the segment, 2) hash code 2 is based on the odd characters in the segment and 3) hash code 3 is based on the even characters in the segment.
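The three-hash scheme can be sketched as below. The text does not name the hash function, so 64-bit FNV-1a is swapped in purely as a stand-in, and counting "odd" and "even" characters from position 1 is likewise an assumption. The function names are hypothetical.

```python
def fnv1a_64(text: str) -> int:
    """64-bit FNV-1a over UTF-8 bytes, returned as a signed 64-bit value."""
    h = 0xCBF29CE484222325  # FNV-1a 64-bit offset basis
    for byte in text.encode("utf-8"):
        h ^= byte
        h = (h * 0x100000001B3) & 0xFFFFFFFFFFFFFFFF  # FNV 64-bit prime
    # Map into the signed range -2**63 .. 2**63 - 1.
    return h - (1 << 64) if h >= (1 << 63) else h


def segment_hashes(segment: str) -> tuple[int, int, int]:
    """Hash all characters, the odd-position ones, and the even-position ones."""
    odd_chars = segment[0::2]   # 1st, 3rd, 5th, ... characters
    even_chars = segment[1::2]  # 2nd, 4th, 6th, ... characters
    return fnv1a_64(segment), fnv1a_64(odd_chars), fnv1a_64(even_chars)
```

Using three codes rather than one makes an accidental match between two different segments far less likely, since all three values would have to collide at once.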
  • the meaning of a word or phrase may change depending on the context in which it's being used. It is also possible that the translation itself may vary depending on the context or placement of a text segment, even if the original meaning does not change. As a result, it may be necessary to specify multiple translations for the same word or phrase, one for each usage context.
  • the text segment locking feature allows translators to do this by providing the ability to “lock” translated text segments together. When two or more translation text segments are locked together they are used only when the exact translation sequence is followed.
  • the translation to Spanish of the text segment “Virtual Brochures” can vary, depending on where it is used. Below is this segment used in an English HTML sentence: <b>Virtual Brochures</b> are great. The corresponding translation to Spanish is: <b>Los Folletos Virtuales</b> son convenientes. Another example of a segment used in an English HTML sentence: There are many great <b>Virtual Brochures</b>. The corresponding translation to Spanish is: Hay muchos grandes <b>Folletos Virtuales</b>.
  • At conversion time, when the translation server 400 encounters the “Virtual Brochures” segment in the first sentence it looks up a corresponding translated segment and gets back two potential matches: “Los Folletos Virtuales” and “Folletos Virtuales”. It then proceeds to look up a translated segment for the next segment “are great” and gets back “son convenientes”. Since “son convenientes” is locked to “Los Folletos Virtuales”, the translation server 400 is able to determine that “Los Folletos Virtuales” is the correct translation to the previous segment “Virtual Brochures”.
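The disambiguation step can be sketched as a lookup over candidate translations, each optionally locked to the translation of the segment that follows it. This is a simplified model and an assumption about the data layout: the `Candidate` type and `choose_translation` function are hypothetical, and only a lock to the next segment is modelled, whereas the locking feature described above can chain two or more segments.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Candidate:
    text: str
    # Translation of the following segment that this candidate is locked to.
    locked_next: Optional[str] = None


def choose_translation(candidates: list[Candidate],
                       next_translation: Optional[str]) -> str:
    """Pick the candidate whose lock matches the next segment's translation."""
    for candidate in candidates:
        if candidate.locked_next is not None and candidate.locked_next == next_translation:
            return candidate.text
    # No lock matched: fall back to the first unlocked candidate, if any.
    for candidate in candidates:
        if candidate.locked_next is None:
            return candidate.text
    return candidates[0].text
```

With candidates “Los Folletos Virtuales” (locked to “son convenientes”) and “Folletos Virtuales” (unlocked), the first is chosen only when the next segment translates to “son convenientes”.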
  • the translation server 400 transparently handles form submissions via GET or POST methods. This means that all form data is forwarded to the original URL that processes the form and that the response page is converted to the alternate language.
  • the first step in the form handling is performed when an HTML page that has a form in it is being converted.
  • the translation server 400 simply rewrites the URL in the ACTION attribute of the <FORM> tag. This is done by prefixing the original URL with the URL of the translation server 400 , so the original URL becomes the query string to the translation server 400 URL, much like the implicit navigation feature in standard links.
  • the browser will perform the POST request to the translation server 400 , which will read the query string to obtain the original URL where the form is to be submitted and perform the POST to that URL, forwarding it all form data.
  • the translation server 400 then reads the response page, converts it to the alternate language, and delivers the translated page to the user directly.
  • the translation server 400 cannot simply rewrite the URL in the ACTION attribute of the <FORM> tag because in a GET method the form data is sent in the query string. As a result, the browser would replace the original URL with the form data and the translation server 400 would not know to what URL to submit the form data. To overcome this limitation, the translation server 400 adds a hidden field to the form whose value contains the original URL, and replaces the URL in the ACTION attribute of the <FORM> tag so the request is sent to the translation server 400 .
  • the browser will perform the GET submission to the translation server 400 , which will read the value of the hidden form field to obtain the original URL where the form is to be submitted and perform the GET submission to that URL, forwarding it all form data.
  • the translation server 400 then reads the response page, converts it to the alternate language, and delivers the translated page to the consumer directly.
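The difference between the POST and GET cases can be sketched as a single rewrite function. This is an illustration only: the function name, the hidden field name `mp_original_url` and the default action ID are hypothetical, since the text does not name the hidden field or say which action ID is used for forms.

```python
def rewrite_form(action_url: str, method: str, server_url: str,
                 action_id: int = 16) -> tuple[str, dict[str, str]]:
    """Return the new ACTION URL and any hidden fields to add to the form."""
    if method.upper() == "POST":
        # POST: the original URL travels in the query string of the new ACTION.
        return f"{server_url}?{action_id};{action_url}", {}
    # GET: the browser overwrites the query string with the form data,
    # so the original URL is carried in a hidden field instead.
    return server_url, {"mp_original_url": action_url}
```

On submission, the translation server would read the original URL from the query string (POST) or from the hidden field (GET) and forward the form data there.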
  • the translation server 400 is capable of translating text segments and files located inside JavaScript or VBScript code. Common types of files can be recognized automatically by their standard extensions. The translation server 400 parses all JavaScript code blocks and replaces the URLs of all files for which a translation exists so that they point to the translated files. Non-standard file extensions and URL patterns may be defined on a per-customer basis to allow the translation server 400 to recognize less common or proprietary file formats, or even dynamically generated files. File recognition and translation can also be controlled from within the JavaScript code through the use of directive tags. These are explained in detail below. Text segments inside script code that require translation must be explicitly identified by placing a set of directive tags around the text.
  • Translation of content inside JavaScript or VBScript include files is also supported.
  • a script include file is downloaded by the browser in a separate HTTP request and included in the web page as if it had appeared within the page. Include files are handled in the same manner as implicit navigation in standard links within the page.
  • the URL of the include file is rewritten so the original include file is prefixed with the URL of the translation server 400 and the original file URL becomes the query string to the translation server 400 URL.
  • the browser will then request the include file from the translation server 400 , which will read the query string to obtain the URL of the original include file and request it from its location.
  • the translation server 400 then reads the file, performs the appropriate conversions, and delivers the modified file to the browser for inclusion in the web page.
  • Directive tags and directive attributes are special HTML tags and attributes that allow more granular control over the translation and implicit navigation within a web page.
  • Directive tags are special HTML comment tags that are ignored by the browser, but provide specific instructions to the translation server 400 .
  • Directive attributes are specially named attributes placed within an HTML tag that are also ignored by the browser, but provide specific instructions to the translation server 400 that apply only to the tag in which the attribute is placed.
  • Translation control tags and attributes are used to specify sections on a web page that should not get translated.
  • One important use of translation control tags is to delimit personal information, such as a person's name, address, credit card numbers, etc. that may show up in a web page, but which should not be processed. Such information simply passes through the translation server 400 without being translated or stored, for security and privacy reasons.
  • the directive tag “mp_trans_partial_start & mp_trans_partial_end” signals the start and end of a partial translation section. This tag may be used at the top of a web page in conjunction with section translate tags to selectively translate sections of a page.
  • the directive tag “mp_trans_enable_start & mp_trans_enable_end” signals the start and end of a section to be translated within a partial translation section. All text and files within this section are translated.
  • the directive tag “mp_trans_disable_start & mp_trans_disable_end” signals the start and end of a section not to be translated when in normal translation mode.
  • the directive tag “mp_trans_machine_start & mp_trans_machine_end” signals that any text segments enclosed within the tags may be machine translated in the event that a human translation is not available.
  • the directive attribute “mpdistrans” disables translation of a file or of translatable text in a tag, such as alt, keywords or description meta-tag, or form buttons.
  • the directive attribute “mpnav” enables implicit navigation for listed attributes in the tag. This attribute can be used for tags that do not normally contain URLs, but do so in a particular page.
  • the directive attribute “mpdisnav” disables implicit navigation for all attributes or only listed attributes of the tag.
  • the directive attribute “mporgnav” forces original navigation for all attributes or only listed attributes of the tag. Original navigation will remove redirection to the translation server if found, otherwise it will leave the link intact. This directive attribute is discussed below with reference to one-link deployment.
  • the translation server 400 would process the above page as follows:
  • the directive tag “mp_trans_textjs_start & mp_trans_textjs_end” signals the start and end of a section inside a script block that contains text to be translated.
  • the directive tag “mp_trans_imgjs_start & mp_trans_imgjs_end” signals the start and end of a section inside a script block that contains images, PDF, Flash or other files to be translated. Under most circumstances these tags are not needed as the translation server 400 JavaScript parser can automatically recognize common types of files by their standard extensions.
  • the directive tag “mp_trans_supressurljs_start & mp_trans_supressurljs_end” signals the start and end of a section inside a script block that inhibits the processing of URLs.
  • URLs are processed for implicit navigation, or to convert relative URLs to absolute URLs if implicit navigation is disabled. This tag may be necessary to avoid processing portions of URLs that are used to build up a final URL by means of concatenation.
  • the above CheckLoginForm function verifies that an online user has entered a login name and password before posting the LoginForm form in the page. If a user has not entered the required information, then a pop-up alert box shows an error message with details.
  • the text of the various error messages is assigned to variables and enclosed in a set of ‘mp_trans_textjs’ directive tags so it can be recognized and translated.
  • One of the primary goals of the TransMotion system is to eliminate or minimize the workload of a customer web site's IT department in order to deploy an alternate language web site.
  • the one-link deployment feature allows a customer to deploy the alternate language web site by simply placing one language-switching link in the home page of the original language site.
  • the one-link deployment is a combination of two features: (1) automatic flipping of the language-switching link, and (2) implicit navigation to maintain the user in the alternate language.
  • Automatic flipping of the language-switching link is specified by using the mporgnav directive attribute in the language-switching link.
  • the mporgnav directive attribute instructs the translation server 400 to rewrite the URL to support automatic language switching.
  • When a user clicks the ‘Click here to see this site in Spanish’ language-switching link, the translation server 400 returns the home page translated, as shown below:
  • In addition to translating the page, the translation server 400 also rewrites the URL in the language-switching link and performs implicit navigation of all other URLs in the page.
  • the translation server 400 rewrites the URL in the language-switching link so that the translation server 400 redirection is removed.
  • the mporgnav directive attribute is used to instruct the translation server 400 to do this.
  • the link text ‘Click here to see this site in Spanish’ is translated as ‘Haga clic aqui para ver este sitio web en Ingles’ (which means ‘Click here to see this site in English’).
  • This automatic and simultaneous change of both the URL and the text (or image) in the language-switching link by the translation server 400 is what allows the user to flip back-and-forth between English and Spanish.
  • Implicit navigation is also performed in all the links on the page.
  • it was performed on the widgets.jsp page.
  • the widgets.jsp page is in turn translated and implicit navigation performed on all of its links within the abcwidgets.com domain. This process is repeated so that the user is always navigating the site in the alternate language.
  • the translation server 400 allows delivering customized content according to the language and/or locale that a user is viewing the site in.
  • When the translation server 400 requests a web page for translation, it sends two cookies to the original web server called ‘mptranslan’ and ‘mptranscty’.
  • the value of the ‘mptranslan’ cookie is a 2 or 3-letter (upper-case) language code in compliance with the ISO 639 standard.
  • the value of the ‘mptranscty’ cookie is a 2-letter (upper-case) country code in compliance with the ISO 3166 standard.
  • Web site server software can determine if a page is being viewed in an alternate language and/or a different country by checking for these cookies. For example, by checking that the ‘mptranslan’ cookie exists, and that its value is ‘ES’, a web server can determine that a page is being served in Spanish and customize the content being served, such as showcasing items that appeal more to Hispanics. In addition, if a company maintains operations in multiple countries, then it can use the ‘mptranscty’ cookie to determine the country and show only products sold or shipped to that country.
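On the customer web server, the cookie check described above can be sketched with Python's standard `http.cookies` module. The cookie names come from the text; the function name is hypothetical, and the raw `Cookie` header is assumed to be available as a string.

```python
from http.cookies import SimpleCookie
from typing import Optional


def translation_locale(cookie_header: str) -> tuple[Optional[str], Optional[str]]:
    """Return (language, country) from the translation server cookies, if present."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    # 'mptranslan' holds an ISO 639 language code, 'mptranscty' an ISO 3166 country code.
    language = cookie["mptranslan"].value if "mptranslan" in cookie else None
    country = cookie["mptranscty"].value if "mptranscty" in cookie else None
    return language, country
```

A result of `("ES", ...)` tells the web server that the page is being served in Spanish, while `(None, None)` indicates a request that did not come through the translation server.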
  • When an online user 416 that is viewing a web site 414 in an alternate language performs an internal site search, it is natural for the user to enter the search keyword(s) in the alternate language.
  • If the translation server 400 forwards the search keyword(s) unchanged to the original web site, the search engine will not be able to find any matching results, or might deliver incorrect results. This occurs because the web server search engine is matching the keyword(s) in the alternate language against a search index of keywords that are in the original language.
  • the translation server 400 provides an elegant solution to this problem by performing a real-time reverse machine translation on the search keyword(s) and forwarding the keyword(s) to the web server search engine in the original language.
  • Reverse machine translation is configured so it is performed only on the specific keyword field(s) of the search form(s) in a web site.
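  • A minimal sketch of field-restricted reverse translation, assuming the search form is submitted as a query string. The glossary below is a toy stand-in for a real machine translation engine, and the names `KEYWORD_FIELDS`, `reverse_translate` and `rewrite_search_query` are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode

# Toy stand-in for a machine translation engine (Spanish -> English).
REVERSE_GLOSSARY = {"zapatos": "shoes", "rojos": "red"}

# Hypothetical per-site configuration: only these form fields are reverse-translated.
KEYWORD_FIELDS = {"q"}


def reverse_translate(text: str) -> str:
    """Word-by-word lookup; unknown words pass through unchanged."""
    return " ".join(REVERSE_GLOSSARY.get(w.lower(), w) for w in text.split())


def rewrite_search_query(query_string: str) -> str:
    """Reverse-translate the configured keyword fields and leave other fields alone."""
    pairs = parse_qsl(query_string, keep_blank_values=True)
    rewritten = [
        (name, reverse_translate(value) if name in KEYWORD_FIELDS else value)
        for name, value in pairs
    ]
    return urlencode(rewritten)


# Only the "q" field is translated; "category" is forwarded as entered.
forwarded = rewrite_search_query("q=zapatos+rojos&category=zapatos")
```

Restricting the rewrite to named fields keeps identifiers, category codes and other non-linguistic form values from being altered on their way to the original search engine.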
  • the system of the present invention is compatible with all Internet search engines, such as Google or AltaVista. These search engines utilize content from both the body and head of the HTML document to index a web page. To ensure transparent compatibility with Internet search engines, the system of the present invention translates all applicable text in the head of the document. This includes the page title, the page description meta-tag, and the keywords meta-tag.
  • The translation server 400 can use real-time machine translation in the event that a human translation is not yet available for a text segment. This is an optional setting that can be specified per-customer, per-URL pattern and/or by means of directive tags.
  • Caching frequently used data in memory is necessary to minimize round trips to the database 406 .
  • There are two types of caches being used: dynamic and static.
  • A dynamic cache is one whose entries are removed from the cache when memory becomes scarce; it uses a Most-Recently-Used (MRU) algorithm to keep the most relevant entries in the cache.
  • the use of an MRU algorithm to manage the cache guarantees that the most frequently accessed and most recently used entries are always in the cache. This type of cache is used for large, long-lived caches.
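  • The dynamic cache behavior described above (keep the most recently used entries, discard the rest under pressure) can be sketched as follows. This is an illustration only: a fixed `capacity` stands in for "memory becomes scarce", and the class name `RecencyCache` and its methods are hypothetical.

```python
from collections import OrderedDict


class RecencyCache:
    """Keeps the most recently used entries; evicts the least recently used one."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._entries = OrderedDict()

    def get(self, key, default=None):
        if key not in self._entries:
            return default
        self._entries.move_to_end(key)  # mark as most recently used
        return self._entries[key]

    def put(self, key, value):
        self._entries[key] = value
        self._entries.move_to_end(key)
        while len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # drop the least recently used entry

    def remove(self, key):
        """Explicit removal, e.g. when an entry is deactivated."""
        self._entries.pop(key, None)

    def __contains__(self, key):
        return key in self._entries


cache = RecencyCache(capacity=2)
cache.put("seg-1", "Hola")
cache.put("seg-2", "Gracias")
cache.get("seg-1")             # touch seg-1 so seg-2 becomes the oldest entry
cache.put("seg-3", "Oferta")   # exceeds capacity, so seg-2 is evicted
```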
  • the translation server 400 contains five memory caches, which are described in more detail below.
  • a main segment cache is a dynamic long-lived cache that stores ACTIVE translated text segments keyed by the composite key derived from the original (not yet translated) text segment's 64-bit hash codes. This allows a quick lookup of translation text. Segments are removed from this cache if they are deactivated in the WebCATT 408 .
  • a translation queue segment cache is a dynamic long-lived cache that stores the text segments of all pages that are in the translation queue. This allows the translation server 400 to determine that a specific text segment that has not yet been translated is already in the queue for translation without having to search the database. Segments are removed from this cache when they are activated in the WebCATT 408 .
  • a main file cache is a dynamic long-lived cache that stores ACTIVE files keyed by their names. This allows the quick lookup of a translated file. Files are removed from this cache if they are deactivated in the WebCATT 408 .
  • a translation queue file cache is a dynamic long-lived cache that stores the files of all pages that are in the translation queue. This allows the translation server 400 to determine that a file that has not yet been translated is already in the queue for translation without having to search the database. Files are removed from this cache when they are activated in the WebCATT 408 .
  • a translation queue page cache is a static long-lived cache that stores all pages that are in the translation queue. This allows the translation server 400 to determine that a page that has not yet been translated is already in the queue for translation without having to search the database.
  • a 64-bit hash code is used to determine if a page in the queue has changed and has to be re-scheduled for translation. Pages are removed from this cache when they are activated in the WebCATT 408 .
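  • The lookup order implied by the segment caches above can be sketched as follows: check the main segment cache first, then the translation queue segment cache, and queue the segment only if both miss. The choice of BLAKE2b with an 8-byte digest as the 64-bit hash is an assumption (the text does not name a hash function), and plain dictionaries and sets stand in for the real caches.

```python
import hashlib


def segment_key(text: str) -> int:
    """64-bit key for a text segment (BLAKE2b, 8-byte digest; an assumed choice)."""
    digest = hashlib.blake2b(text.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


# Hypothetical in-memory stand-ins for two of the caches.
main_segment_cache = {segment_key("Sale"): "Oferta"}   # ACTIVE translations
queue_segment_cache = {segment_key("New widgets")}     # segments awaiting translation


def resolve(text: str):
    """Return (state, text to display) without touching the database on a cache hit."""
    key = segment_key(text)
    if key in main_segment_cache:
        return "translated", main_segment_cache[key]
    if key in queue_segment_cache:
        return "queued", text              # already scheduled; show the original
    queue_segment_cache.add(key)           # first sighting: schedule it
    return "newly_queued", text


results = [resolve("Sale"), resolve("New widgets"), resolve("Contact us"), resolve("Contact us")]
```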
  • the translation server 400 is advantageous as it does not require IT integration with an existing web site infrastructure.
  • the present invention converts the outbound HTML stream after it has left the client web server 414 .
  • There is no need to re-architect an existing web site or build a separate web site for the alternate language.
  • No client storage or management of translated data is required. Translated data is managed and maintained by the WebCATT 408 software outside of the web site's database.
  • the translation server 400 is further advantageous as it works with any client web server hardware and software technology infrastructure. Further, it allows for evolution of the existing client's hardware and software technology infrastructure. Moreover, deployment of the present invention requires minimal effort as a reduced amount of client IT resources are required.
  • the one-link deployment feature involves the client placing one link on the web site 414 to provide access to the alternate language web site. Therefore, deployment is rapid and cost effective.
  • the WebCATT (Web Computer Aided Translation Tool) 408 is a web based Graphical User Interface (GUI) application that is used to perform and manage human translations.
  • the tool is built specifically for web (HTML) page translations. It can be used by professional translators to translate web site translatable components and by managers to manage the translation process. Since WebCATT 408 is a web-based application that is accessed via the Internet 412 , translators and managers can be located in different geographical areas.
  • WebCATT 408 is similar to other computer aided translation tools used by professional translation service organizations. WebCATT 408 supports localization, text recognition, fuzzy matching, translation memory, internal repetitions, alignment, and a glossary/terminology database. WebCATT 408 is designed for web site translation and includes other features optimized for web translation, such as What You See Is What You Get (WYSIWYG) HTML previewing and support for image/graphic translation.
  • WebCATT 408 organizes the translation workload into web pages.
  • A web page is the HTML content generated by a specific URL address, regardless of whether that content is static (i.e., physically resides in the web server in a file with an html extension), or dynamic (i.e., the content is generated dynamically by combining information from a database and HTML templates). Dynamic pages that are dependent on session information (e.g., a shopping cart checkout page) are also supported.
  • a text segment is a chunk of text on the page as defined by the HTML that surrounds it.
  • a text segment can range from a single word to a paragraph or multiple paragraphs.
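  • A simplified sketch of splitting a page into text segments, using Python's standard `html.parser`. It treats every run of visible text between two tags as one segment, which is an assumption: the text only says that a segment is defined by the HTML surrounding it, and a production parser would need rules for inline tags. The names `SegmentExtractor` and `extract_segments` are hypothetical.

```python
from html.parser import HTMLParser


class SegmentExtractor(HTMLParser):
    """Collects each run of visible text between tags as one text segment."""

    SKIPPED = {"script", "style"}  # code, not translatable text

    def __init__(self):
        super().__init__()
        self.segments = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = " ".join(data.split())  # collapse whitespace
        if text and self._skip_depth == 0:
            self.segments.append(text)


def extract_segments(html: str) -> list:
    parser = SegmentExtractor()
    parser.feed(html)
    parser.close()
    return parser.segments


page = (
    "<html><head><title>ABC Widgets</title><script>var x = 1;</script></head>"
    "<body><h1>Sale</h1><p>Click here to see this site in Spanish</p></body></html>"
)
segments = extract_segments(page)
```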
  • A file is any type of external content that resides in a file, is linked from within the page, and may require translation. Typical types of files found in web pages are images, PDF files, MS Word documents and Flash movies.
  • a file is translated by uploading a replacement file that has all text and/or sounds translated.
  • FIG. 9 is a screenshot of a WebCATT interface used for viewing a translatable component, in one embodiment of the present invention.
  • FIG. 9 shows a display area 902 in which a web page including a translatable component in a first language (in this case, English) is displayed.
  • FIG. 9 also shows a section 904 including information associated with the web page displayed in display area 902 , such as page status, page URL, page ID, etc.
  • FIG. 9 also shows a section 906 including statistics associated with the web site from which the displayed web page is garnered, such as the number of files translated, the number of segments translated, the number of translations suppressed, etc.
  • FIG. 10 is a screenshot of a WebCATT interface used for viewing a translatable component along with a corresponding translation, in one embodiment of the present invention.
  • FIG. 10 shows a display area 1002 in which an original image file translatable component is displayed in a first language (in this case, English).
  • FIG. 10 shows a display area 1004 in which a translated image file is displayed in a second language (in this case, Spanish).
  • FIG. 10 also shows a section 1006 including information associated with the file displayed in display areas 1002 - 1004 , such as file status, file URL, file ID, etc.
  • FIG. 10 shows how WebCATT 408 allows a user to view a translatable component alongside a corresponding translated component for comparison.
  • FIG. 11 is a screenshot of a WebCATT interface used for editing a translatable component, in one embodiment of the present invention.
  • FIG. 11 shows a display area 1102 in which a web page including a translated component in a second language (in this case, Spanish) is displayed.
  • the display area 1102 provides a WYSIWYG web page preview feature that allows viewing the translated web page as it is being translated. Translations can often result in a significant amount of word growth (e.g., approx. 20% from English to Spanish) or shrinkage, which can result in carefully formatted web page layouts being knocked out of alignment by the longer text.
  • the WYSIWYG page preview feature allows translators to immediately see the translated web pages and quickly make adjustments in word choice in order to maintain the correct alignment and layout of the page when translated.
  • FIG. 11 also shows a section 1104 including information associated with the web page displayed in display area 1102 , such as page status, page URL, page ID, etc.
  • FIG. 11 also shows a section 1106 including statistics associated with the web site from which the displayed web page is garnered, such as the number of files translated, the number of segments translated, the number of translations suppressed, etc. In addition to each of those statistics, a breakdown of translated and not translated components is shown in both units and percentages.
  • a section 1110 provides a text segment edit form that allows a translator to edit text segments in the order they appear on the page.
  • This form features a fuzzy search feature that automatically shows and sorts existing segment matches in the database.
  • the translator can copy an existing translation from the search results area to use as a starting translation.
  • a section 1108 provides a file list form that allows a translator to preview all linked files on the page.
  • the list form allows the translator to select all files that do not require translation (e.g., an image with no text) and quickly tag them as such. It also allows a translator to select individual files for translation via the file edit form.
  • File translation involves uploading a translated file and translating the file text description if present.
  • the GUI of FIG. 11 allows a user to view the plurality of translated components placed into the format derived from the first, or source, content, thereby enabling a user to review how the translated components are rendered in the first content format.
  • the GUI of FIG. 11 further allows a user to highlight any of the plurality of translatable components, which are not yet translated, differently from translated components when previewing the plurality of translated components in the first content format.
  • the GUI of FIG. 11 further allows a user to display text when hovering over a translated component so as to view the first content corresponding to the translated component.
  • the GUI of FIG. 11 further allows a user to select at least one of the translated components when previewing the plurality of translated components in the first content format so as to edit the translated component and store the translated component that has been revised with the corresponding unique identifier.
  • the GUI of FIG. 11 further allows previewing in a multi-user environment so that more than one user can simultaneously view translated components rendered in the first content format.
  • WebCATT 408 also provides complete management of the translation process.
  • Web pages are scheduled for translation either automatically by the translation server 400 , or manually by a manager via upload of web pages or other type of content to be translated.
  • When a web page is scheduled for translation, it is placed in the translation queue of a specific customer.
  • Pages to be translated are scheduled for translation on a priority basis using algorithms based on the percentage of the page already translated and how often the page is being accessed on the original web server while it's in the translation queue. This allows the most important pages (i.e., most frequently accessed and those with smaller changes) to be translated first.
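  • One way such a priority ordering could look is sketched below. The text names the two inputs (percentage of the page already translated, and access frequency while queued) but not the formula, so the weighted sum, the weights and the names `QueuedPage`, `priority` and `order_queue` are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class QueuedPage:
    url: str
    percent_translated: float  # 0-100, share of the page already translated
    access_count: int          # hits on the original server while queued


def priority(page: QueuedPage, access_weight: float = 1.0, done_weight: float = 1.0) -> float:
    """Higher score = translate sooner. Formula and weights are assumptions."""
    return access_weight * page.access_count + done_weight * page.percent_translated


def order_queue(pages):
    """Most frequently accessed pages with the smallest remaining changes come first."""
    return sorted(pages, key=priority, reverse=True)


queue = [
    QueuedPage("/about.jsp", percent_translated=10.0, access_count=5),
    QueuedPage("/widgets.jsp", percent_translated=90.0, access_count=400),
    QueuedPage("/sale.jsp", percent_translated=95.0, access_count=40),
]
ordered = [p.url for p in order_queue(queue)]
```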
  • Once pages are in the translation queue, a manager can assign them for translation to a specific translator or translation service subcontractor. If assigned to a subcontractor, a subcontractor manager can then assign them to specific translators within the subcontractor organization or even to freelancers that work with them. Proofers can also be assigned. A subcontractor can assign its own proofers to pages and managers can also assign proofers to check the work of translators or subcontractors.
  • a web page must go through a series of status changes before it is available via the Internet.
  • a page can have any of the following statuses: NEW, IN-PRODUCTION, and ACTIVE.
  • When a page is placed in the queue, its status is NEW.
  • When a translator first accesses the page for the purpose of translating it, its status is changed to IN-PRODUCTION. After the page is fully translated and proofed, a manager changes its status to ACTIVE. Only ACTIVE pages are available via the Internet.
  • the text and files within the page maintain their own translation status.
  • the status for text segments and files is maintained both at the page level (i.e., one single overall status for all segments in the page and another one for all files in the page) and individually.
  • a text segment or file can have any of the following statuses: NEW, TRANSLATED, CONTRACTOR_PROOFED, PROOFED and ACTIVE.
  • the initial status is NEW.
  • After a translator translates the text or file, the status is changed to TRANSLATED.
  • When the translation is proofed by a subcontractor proofer, the status is changed to CONTRACTOR_PROOFED, and when it is proofed by an internal proofer, the status is changed to PROOFED.
  • Finally, the manager changes the status to ACTIVE.
  • a page can only be activated after all segments and files within it are ACTIVE.
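  • The status rules above can be sketched as a small state machine. The status names are the ones listed for text segments and files (NEW, TRANSLATED, CONTRACTOR_PROOFED, PROOFED and ACTIVE); the exact set of allowed transitions is an assumption, since the text describes the usual path but not every permitted jump, and the function names are hypothetical.

```python
# Allowed status changes for a text segment or file (assumed transition table).
SEGMENT_TRANSITIONS = {
    "NEW": {"TRANSLATED"},
    "TRANSLATED": {"CONTRACTOR_PROOFED", "PROOFED"},
    "CONTRACTOR_PROOFED": {"PROOFED", "ACTIVE"},
    "PROOFED": {"ACTIVE"},
    "ACTIVE": set(),
}


def advance(current: str, new: str) -> str:
    """Return the new status, or raise if the change is not allowed."""
    if new not in SEGMENT_TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal transition {current} -> {new}")
    return new


def can_activate_page(component_statuses) -> bool:
    """A page can be activated only when all its segments and files are ACTIVE."""
    return all(status == "ACTIVE" for status in component_statuses)


status = "NEW"
for step in ("TRANSLATED", "PROOFED", "ACTIVE"):
    status = advance(status, step)

ready = can_activate_page(["ACTIVE", "ACTIVE"])
not_ready = can_activate_page(["ACTIVE", "PROOFED"])
```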
  • FIG. 12 is a screenshot of a WebCATT interface used for viewing a translation queue, in one embodiment of the present invention.
  • FIG. 12 shows a series of columns wherein a unit of information is provided for each page of the web site 414 listed on each row.
  • FIG. 12 shows a first column 1202 including unique page identifiers.
  • Column 1204 includes a URL for each page.
  • Column 1206 includes receipt data for each page.
  • Column 1208 includes a percentage statistic indicating the percentage of the page that has been translated.
  • Column 1210 indicates a status for each page.
  • Column 1212 indicates the contractor assigned to the page.
  • FIG. 13 is an operational flow diagram depicting the process of WebCATT 408 , according to a preferred embodiment of the present invention.
  • the operational flow diagram of FIG. 13 depicts the process by which WebCATT 408 , which provides a web based tool for managing language translations of content, queues and translates components of a web site 414 .
  • the operational flow diagram of FIG. 13 begins with step 1302 and flows directly to step 1304 .
  • WebCATT 408 retrieves a first content, or HTML source page, in a first language from the web site 414 .
  • WebCATT 408 parses the first content into a plurality of translatable components.
  • WebCATT 408 generates a unique identifier for each of the plurality of translatable components of the first content.
  • WebCATT 408 queues the plurality of translatable components and corresponding unique identifiers for human or machine translation into a second language.
  • WebCATT 408 stores a translated component and an associated unique identifier corresponding to the translatable component, thereby storing a plurality of translated components and corresponding unique identifiers.
  • In step 1314 , WebCATT 408 provides the plurality of translatable components and corresponding unique identifiers to a third party for human translation into a second language.
  • In step 1316 , the control flow of FIG. 13 stops.
  • WebCATT 408 is advantageous as it allows translators to work directly with live pages off the web site 414 being translated. Thus, the client web site 414 need not send information to the translation server 400 for translation. Furthermore, all web pages in a web site are automatically entered into the translation work queue by the WebCATT 408 spider 404 , described in greater detail below.
  • WebCATT 408 is further advantageous as WYSIWYG preview allows translators to see translated web pages, as they would appear on the live web site. This allows the translator to compensate for word growth or shrinkage that knocks a web page layout out of alignment.
  • a translated preview page is marked-up with special HTML & JavaScript to allow: 1) color coding of all text in the web page so the translator can see what is already translated, what remains to be translated and where the current text segment is located within the page, 2) clicking in text or a file to take the translator to a form to edit the translation for the text or file and 3) hovering the mouse over a text or file to pop up a window showing the original wording or file.
  • WebCATT 408 is further advantageous as pages are parsed into their translatable components and translators only work with these components, not a complex group of HTML files. All HTML and script code is hidden when using WebCATT 408 . WebCATT 408 is further beneficial as it can be utilized via the ASP model and translators can access it via the web. Translated pages can be delivered via the translation server 400 or saved as static html pages to be sent to the client, wherein links among pages are modified so they reference the translated pages.
  • WebCATT 408 is further beneficial as it allows management of the translation process. Multiple user access levels are supported: managers, proofers, translators & sub-contractors. Managers can assign work in the translation queue to translators, proofers and/or subcontractors. Subcontractor managers can in turn sub-assign work to subcontractor translators and proofers. Managers must activate web pages before the translation server 400 can deliver them.
  • a spider is a program that visits web sites and reads their pages and other information in order to create entries for an index such as a search engine index.
  • For example, the major search engines on the Internet all have such a program, which is also known as a “crawler” or a “bot.”
  • Spiders are typically programmed to visit web sites that have been submitted by their owners as new or updated. Entire web sites or specific pages can be selectively visited and indexed. Spiders are called spiders because they usually visit many web sites in parallel at the same time, their “legs” spanning a large area of the “web.” Spiders can crawl through a web site's pages in several ways.
  • Spiders for the major search engines on the Internet adhere to the rules of politeness for Web spiders that are specified in a standard for robot exclusion. This standard asks each server which files should be excluded from being indexed. A spider does not (or cannot) go through a firewall. The standard also prescribes a special algorithm for waiting between successive server requests so that the spider doesn't affect web site response time for other users.
  • The operations of a spider are in contrast with a normal web browser operated by a human that doesn't automatically follow links other than inline images and URL redirection.
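  • As an illustration of the robot exclusion rules mentioned above, Python's standard `urllib.robotparser` module can evaluate a robots.txt file. The file content, the user-agent name `example-spider` and the URLs below are hypothetical; the `Crawl-delay` line is a commonly used extension rather than part of the original exclusion standard.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content served by the web site being crawled.
robots_txt = """\
User-agent: *
Disallow: /private/
Crawl-delay: 2
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())  # parse in-memory lines; no network access

# A polite spider checks each URL before requesting it.
allowed = parser.can_fetch("example-spider", "http://www.abcwidgets.com/widgets.jsp")
blocked = parser.can_fetch("example-spider", "http://www.abcwidgets.com/private/report.jsp")

# Seconds to wait between successive requests, if the site asks for a delay.
delay = parser.crawl_delay("example-spider")
```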
  • FIG. 4 shows a spider 404 for use in analyzing and sizing a web site 414 .
  • the spider 404 is a tool that crawls specific web sites and performs any of a variety of actions.
  • the spider 404 can crawl a web site in order to populate the WebCATT translation queue with new or updated information.
  • the spider 404 may also gather content statistics that can be used to provide a monetary quote for deployment of the present invention.
  • FIG. 14 is an operational flow diagram depicting the process of spider 404 , according to a preferred embodiment of the present invention.
  • the operational flow diagram of FIG. 14 depicts the process by which spider 404 , which provides a web based tool for sizing a web site for language translation, retrieves and indexes translatable components of a web site 414 .
  • the operational flow diagram of FIG. 14 begins with step 1402 and flows directly to step 1404 .
  • spider 404 retrieves a first content, or HTML source page, in a first language from the web site 414 .
  • the first content in a first language is for translation into a second content in a second language.
  • the second web content is a human or machine translation in a second language of the first web content.
  • spider 404 parses the first content into a plurality of translatable components.
  • A translatable component includes any one of a text segment, an image file with text to be translated, a multimedia file with text or audio to be translated, a file with text to be translated, a file with an image with text to be translated, a file with audio to be translated and a file with video with at least one of text and audio to be translated.
  • In step 1408 , spider 404 generates a unique identifier for each of the plurality of translatable components of the first content. For a text segment, the translation server 400 can generate a unique identifier using a hash code, a checksum or a mathematical algorithm. In step 1410 , spider 404 stores the plurality of translatable components and corresponding unique identifiers in the database 406 for human or machine translation into the second language.
  • spider 404 queues the plurality of translatable components and corresponding unique identifiers for human or machine translation into a second language.
  • spider 404 provides the plurality of translatable components and corresponding unique identifiers to WebCATT 408 for human translation into a second language.
  • spider 404 generates statistics based on the translatable components retrieved from the web site 414 . The statistics generated include a file count, a page count, a translatable segment count, a unique text segment count, a unique text segment word count and a word count.
  • the spider 404 can further generate a web page having a link to each file of the web site 414 .
  • the control flow of FIG. 14 stops.
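  • The statistics listed in the flow above can be computed from the crawled components roughly as follows. The input layout (a mapping from page URL to its segments and linked files), the function name `site_statistics`, and the choice to count each distinct file URL once are assumptions made for this sketch; the sample pages are illustrative.

```python
def site_statistics(pages: dict) -> dict:
    """pages maps a page URL to {"segments": [...], "files": [...]}."""
    all_segments = [s for page in pages.values() for s in page["segments"]]
    unique_segments = set(all_segments)
    unique_files = {f for page in pages.values() for f in page["files"]}
    return {
        "page_count": len(pages),
        "file_count": len(unique_files),
        "segment_count": len(all_segments),
        "unique_segment_count": len(unique_segments),
        "word_count": sum(len(s.split()) for s in all_segments),
        "unique_segment_word_count": sum(len(s.split()) for s in unique_segments),
    }


crawl = {
    "/index.jsp": {
        "segments": ["ABC Widgets", "Click here to see this site in Spanish"],
        "files": ["/img/logo.gif"],
    },
    "/widgets.jsp": {
        "segments": ["ABC Widgets", "Blue widgets on sale"],
        "files": ["/img/logo.gif", "/docs/catalog.pdf"],
    },
}
stats = site_statistics(crawl)
```

The unique segment word count is the figure most relevant to pricing, since a repeated segment only needs to be translated once.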
  • the spider 404 can be pre-configured for each customer web site so that the use of directive tags and/or attributes is eliminated or minimized. This minimizes the workload of the customer web site's IT personnel. Further, the spider 404 can be separately pre-defined by domain and/or by URL pattern. This allows specifying sections of a web site to be translated without the need for placing directive tags in each web page.
  • the spider 404 is advantageous as it can be used to update the WebCATT 408 translation work queue. Further, spider 404 can be used to gather statistics about a web site 414 in order to allow estimating the amount of work involved in translating the web site and pricing accordingly.
  • Spider 404 can summarize word counts, segment counts, file counts and page counts of a web site 414 .
  • The spider 404 is further efficient and supplements the functions of WebCATT 408 as it works to save all unique text segments and file URLs in the database 406 for later translation into a second language. It can further create an HTML page containing links to all files of web site 414 , so the files can be reviewed for translation at a later time.
  • The spider 404 is efficient in navigating and crawling a web site 414 as it can emulate a browser by saving and returning cookies. Spider 404 can further fill out and submit forms with pre-defined information and is able to establish a session and normalize session ID parameters for e-commerce sites. Spider 404 can further be configured to crawl only specific areas of a web site by defining include/exclude domains and URL patterns. Spider 404 can also be configured to send specific HTTP headers, such as the user-agent (i.e., type of browser). Spider 404 can be executed in a single computer or in distributed mode. In distributed mode, multiple machines work in conjunction to crawl the same web site simultaneously sharing the same database 406 .
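  • A minimal sketch of two of the spider settings described above: include/exclude rules and session ID normalization. The configuration values, the parameter names `jsessionid` and `sid`, and the function names are hypothetical, and "normalizing" is interpreted here as dropping the session parameter so that the same page is not crawled twice under different session IDs.

```python
import re
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical per-customer spider configuration.
INCLUDE_DOMAINS = {"www.abcwidgets.com"}
EXCLUDE_PATTERNS = [re.compile(r"^/admin/")]
SESSION_PARAMS = {"jsessionid", "sid"}


def should_crawl(url: str) -> bool:
    """Crawl only included domains, skipping paths that match an exclude pattern."""
    parts = urlsplit(url)
    if parts.hostname not in INCLUDE_DOMAINS:
        return False
    return not any(pattern.search(parts.path) for pattern in EXCLUDE_PATTERNS)


def normalize(url: str) -> str:
    """Drop session ID query parameters and the fragment from a URL."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name.lower() not in SESSION_PARAMS
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


normalized = normalize("http://www.abcwidgets.com/cart.jsp?item=7&sid=ABC123")
```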
  • Automatic maintenance keeps the alternate language web site in synchronization with the original site with little or no human intervention.
  • Automatic maintenance is based on a function of the translation server 400 .
  • the act of viewing a never-before translated or a modified page in the alternate language enables the scheduling of the web page for translation.
  • the spider agent 404 can be used to crawl a web site, or just portions of a web site, in the alternate language on a regular basis. Crawling the web site in the alternate language is equivalent to a user viewing the site in the alternate language, and thus results in any new or modified pages being placed in the WebCATT 408 translation queue.
  • This technique is ideal for regularly scheduled updates to a web site, which normally happens after hours. For example, if the ABC Widgets web site modifies its sale offerings twice a week, such as on Mondays and Fridays at 12 AM, then the spider agent 404 can be scheduled to crawl the relevant parts of the site shortly after (e.g., at 12:30 AM) on those days. Around-the-clock translators can then translate the new sale banners so that the alternate language web site is up to date sometime later that morning.
  • the spider agent 404 can also be used to regularly (e.g., daily) crawl a web site even when changes are not regularly scheduled. This will guarantee that the alternate language site is in sync with the original language site after every crawl and subsequent translation.
  • Another way to leverage the auto-scheduling function of the translation server 400 involves user access. Even if no manual quality assurance reviews or scheduled spider agent 404 crawls are performed, the alternate language web site is still automatically maintained up to date over the long term. This is because the first online user that attempts to view a new or modified page in the alternate language will trigger the placement of that page into the WebCATT translation queue. In that case, the online user will see the page in the original language or will see a partially translated page, depending on the amount of new content in the page and the pre-defined customer-specified translation threshold. However, subsequent users that access the page will see the web page in the alternate language after it has been translated.
  • the present invention also supports manual maintenance of the alternate language web site so as to be maintained in synchronization with the original site.
  • New information that needs translation can also be manually placed in the translation queue using WebCATT 408 .
  • This can be useful to translate large amounts of data that are available in advance of being on the live web site 414 . For example, if the ABC Widgets web site updates its web site with new product offerings every Thursday morning and all product information is available by the previous Tuesday, then all new product data can be manually batched into the translation queue using WebCATT 408 as soon as it is available so it is fully translated by the time the new web pages go live.
  • Population of the WebCATT 408 translation queue can be performed either by URL or by content.
  • Population by URL means that translation server 400 stores only the URL of the page in the queue. The content of the URL is retrieved afterwards when a translator accesses the page to translate it using WebCATT 408 .
  • Population by URL can present a problem if the content of the page is dependent on session information, such as a session ID present in a query parameter or stored in a cookie. In that case, the session ID in the query parameter may have expired or the session information stored in the cookie will not be present when viewing the page in WebCATT 408 . This is usually the case in shopping cart or account access pages.
  • Session dependent pages can be handled in two ways: (1) by replicating the session state via cookies and/or updated session parameters, or (2) by populating the page by content.
  • Replicating the session state means that the translator must manually re-acquire a session from the original site and then enter the session data in WebCATT 408 . Once the session data is entered it can be used for translating multiple pages.
  • Population by content means that translation server 400 stores the full content of the page in the queue. This avoids the session dependence issue, but can result in outdated content. As a result, population by content is only used for session dependent pages, and population by URL, which guarantees that the content being translated is the latest content, is used for all other pages.
  • Access to the WebCATT 408 translation queue is segmented by customer and prioritized. Pages to be translated are scheduled for translation on a priority basis using algorithms based on the percentage of the page already translated and how often the page is being accessed on the original web server while the page is in the translation queue. This allows the most important pages (i.e., most frequently accessed and those with smaller changes) to be translated first.
  • A file change detection feature can be used to deal with files whose content has changed while their names remain the same.
  • the translation server 400 and WebCATT 408 can match a file to be translated with its translated file by the URL of the original file. However, it is possible for a file to be changed while its name and location remain the same. In that case, it is possible that an outdated translated file is used for the translation.
  • To detect this, the translation server 400 computes a hash-code or checksum based on the binary content of the file and stores it with the URL.
  • When the file is encountered again, the translation server 400 re-computes the hash-code or checksum and compares it against the stored one. If they match, the file has not changed and the existing translated file can be used as replacement. However, if they do not match, the binary content of the file was changed and the existing file translation cannot be used. In that case, the page that contains the file is placed in the WebCATT 408 translation queue so the file may be re-translated.
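  • The file comparison described above can be sketched as follows. SHA-256 is used as the checksum purely for illustration (the text does not name an algorithm), a dictionary stands in for the stored URL-to-checksum records, and the function names are hypothetical.

```python
import hashlib

# Hypothetical stand-in for the stored records: file URL -> checksum.
stored_checksums = {}


def checksum(data: bytes) -> str:
    """Checksum over the binary content of a file (SHA-256; an assumed choice)."""
    return hashlib.sha256(data).hexdigest()


def file_changed(url: str, data: bytes) -> bool:
    """Record the file's checksum; return True only if a known file has changed."""
    new = checksum(data)
    old = stored_checksums.get(url)
    stored_checksums[url] = new
    return old is not None and old != new


first_seen = file_changed("/img/banner.gif", b"GIF89a-sale-50-percent")
same_bytes = file_changed("/img/banner.gif", b"GIF89a-sale-50-percent")
new_bytes = file_changed("/img/banner.gif", b"GIF89a-sale-70-percent")
```

When `file_changed` returns True, the existing translated file is stale and the containing page would be placed back in the translation queue.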
  • FIG. 15 is an operational flow diagram depicting the synchronization process according to a preferred embodiment of the present invention.
  • the operational flow diagram of FIG. 15 depicts the automated maintenance process of the alternate language web site so as to be maintained in synchronization with the original web site 414 .
  • the operational flow diagram of FIG. 15 begins with step 1502 and flows directly to step 1504 .
  • a first content in a first language is retrieved from the web site 414 .
  • the first content in a first language is for translation into a second content in a second language.
  • the second web content is a human or machine translation in a second language of the first web content.
  • the first content is parsed into a plurality of translatable components.
  • a unique identifier is generated for each of the plurality of translatable components of the first content.
  • a unique identifier is generated using a hash code, a checksum or a mathematical algorithm.
  • A plurality of translated components of the second web content are identified or matched using the unique identifier of each of the plurality of translatable components of the first web content. If a translatable component of the first web content is not matched to a translated component of the second web content, in step 1512 , the translatable component is designated for translation into the second language. In optional step 1514 , the plurality of translatable components that were not matched are queued for human or machine translation into a second language. In optional step 1516 , the plurality of translatable components that were not matched are provided to WebCATT 408 for translation into a second language. In step 1518 , the control flow of FIG. 15 stops.
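  • The matching portion of the synchronization flow of FIG. 15 can be sketched as follows. This is a simplified illustration under stated assumptions: the first content is taken to be already parsed into text components, the unique identifier is a SHA-256 digest of whitespace-normalized text (the text only requires a hash code, checksum or mathematical algorithm), and the names component_id and synchronize are hypothetical.

```python
import hashlib


def component_id(text: str) -> str:
    """Assumed identifier: SHA-256 of the component with whitespace collapsed."""
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def synchronize(components, translated_by_id):
    """Match translatable components to stored translations by identifier.

    components:       list of translatable text components of the first content
    translated_by_id: dict mapping component identifier -> translated component
    Returns (matched, to_translate), where to_translate holds the components
    designated for translation into the second language.
    """
    matched = {}
    to_translate = []
    for text in components:
        cid = component_id(text)
        if cid in translated_by_id:
            matched[text] = translated_by_id[cid]
        else:
            to_translate.append(text)
    return matched, to_translate
```

  • The unmatched list corresponds to the components that would be queued for human or machine translation or provided to WebCATT 408.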
  • the present invention can be realized in hardware, software, or a combination of hardware and software.
  • a system according to a preferred embodiment of the present invention can be realized in a centralized fashion in one computer system, or in a distributed fashion where different elements are spread across several interconnected computer systems. Any kind of computer system—or other apparatus adapted for carrying out the methods described herein—is suited.
  • a typical combination of hardware and software could be a general-purpose computer system with a computer program that, when being loaded and executed, controls the computer system such that it carries out the methods described herein.
  • An embodiment of the present invention can also be embedded in a computer program product, which comprises all the features enabling the implementation of the methods described herein, and which—when loaded in a computer system—is able to carry out these methods.
  • Computer program means or computer program as used in the present invention indicates any expression, in any language, code or notation, of a set of instructions intended to cause a system having an information processing capability to perform a particular function either directly or after either or both of the following: a) conversion to another language, code, or notation; and b) reproduction in a different material form.
  • A computer system may include, inter alia, one or more computers and at least one computer readable medium, allowing the computer system to read data, instructions, messages or message packets, and other computer readable information from the computer readable medium.
  • the computer readable medium may include non-volatile memory, such as ROM, Flash memory, Disk drive memory, CD-ROM, and other permanent storage. Additionally, a computer readable medium may include, for example, volatile storage such as RAM, buffers, cache memory, and network circuits. Furthermore, the computer readable medium may comprise computer readable information in a transitory state medium such as a network link and/or a network interface, including a wired network or a wireless network, that allow a computer system to read such computer readable information.
  • FIG. 16 is a block diagram of a computer system useful for implementing an embodiment of the present invention.
  • the computer system includes one or more processors, such as processor 1604 .
  • the processor 1604 is connected to a communication infrastructure 1602 (e.g., a communications bus, cross-over bar, or network).
  • the computer system can include a display interface 1608 that forwards graphics, text, and other data from the communication infrastructure 1602 (or from a frame buffer not shown) for display on the display unit 1610 .
  • the computer system also includes a main memory 1606 , preferably random access memory (RAM), and may also include a secondary memory 1612 .
  • the secondary memory 1612 may include, for example, a hard disk drive 1614 and/or a removable storage drive 1616 , representing a floppy disk drive, a magnetic tape drive, an optical disk drive, etc.
  • the removable storage drive 1616 reads from and/or writes to a removable storage unit 1618 in a manner well known to those having ordinary skill in the art.
  • Removable storage unit 1618 represents a floppy disk, magnetic tape, optical disk, etc. which is read by and written to by removable storage drive 1616 .
  • the removable storage unit 1618 includes a computer usable storage medium having stored therein computer software and/or data.
  • the secondary memory 1612 may include other similar means for allowing computer programs or other instructions to be loaded into the computer system.
  • Such means may include, for example, a removable storage unit 1622 and an interface 1620 .
  • Examples of such may include a program cartridge and cartridge interface (such as that found in video game devices), a removable memory chip (such as an EPROM, or PROM) and associated socket, and other removable storage units 1622 and interfaces 1620 which allow software and data to be transferred from the removable storage unit 1622 to the computer system.
  • the computer system may also include a communications interface 1624 .
  • Communications interface 1624 allows software and data to be transferred between the computer system and external devices. Examples of communications interface 1624 may include a modem, a network interface (such as an Ethernet card), a communications port, a PCMCIA slot and card, etc.
  • Software and data transferred via communications interface 1624 are in the form of signals which may be, for example, electronic, electromagnetic, optical, or other signals capable of being received by communications interface 1624 . These signals are provided to communications interface 1624 via a communications path (i.e., channel) 1626 .
  • This channel 1626 carries signals and may be implemented using wire or cable, fiber optics, a phone line, a cellular phone link, an RF link, and/or other communications channels.
  • the terms “computer program medium,” “computer usable medium,” and “computer readable medium” are used to generally refer to media such as main memory 1606 and secondary memory 1612 , removable storage drive 1616 , a hard disk installed in hard disk drive 1614 , and signals. These computer program products are means for providing software to the computer system.
  • the computer readable medium allows the computer system to read data, instructions, messages or message packets, and other computer readable information from the computer readable medium.
  • the computer readable medium may include non-volatile memory, such as Floppy, ROM, Flash memory, Disk drive memory, CD-ROM, and other permanent storage. It is useful, for example, for transporting information, such as data and computer instructions, between computer systems.
  • the computer readable medium may comprise computer readable information in a transitory state medium such as a network link and/or a network interface, including a wired network or a wireless network, that allow a computer to read such computer readable information.
  • Computer programs are stored in main memory 1606 and/or secondary memory 1612 . Computer programs may also be received via communications interface 1624 . Such computer programs, when executed, enable the computer system to perform the features of the present invention as discussed herein. In particular, the computer programs, when executed, enable the processor 1604 to perform the features of the computer system. Accordingly, such computer programs represent controllers of the computer system.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Artificial Intelligence (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Probability & Statistics with Applications (AREA)
  • Information Transfer Between Computers (AREA)
  • Machine Translation (AREA)

Abstract

A system, method and computer readable medium for synchronizing web content is disclosed. The method includes retrieving a first web content in a first language from a web site, the first web content corresponding to a second web content wherein the second web content is a translation in a second language of the first web content. The method further includes dividing the first web content into a plurality of translatable components and generating a unique identifier for each of the plurality of translatable components. The method further includes matching each of the plurality of translatable components to a plurality of translated components of the second web content using the unique identifier of each of the plurality of translatable components. If a translatable component is not matched to a translated component, the method further includes designating the translatable component for translation into the second language.

Description

    CROSS-REFERENCE TO RELATED APPLICATIONS
  • This non-provisional application is a continuation of U.S. patent application Ser. No. 13/933,815 filed Jul. 2, 2013, which is a continuation of U.S. patent application Ser. No. 12/609,834, entitled “ANALYZING WEB SITE FOR TRANSLATION”, filed on Oct. 30, 2009, which is a continuation of U.S. patent application Ser. No. 10/784,334, entitled “ANALYZING WEB SITE FOR TRANSLATION”, filed on Feb. 23, 2004, which is a continuation-in-part of the provisional patent application Ser. No. 60/449,571 with inventors Enrique Travieso, Adam Rubenstein, and William Fleming and entitled “TRANSLATION SYSTEM ARCHITECTURE” filed Feb. 21, 2003, and commonly assigned herewith to Motionpoint Corporation, which is hereby incorporated by reference in its entirety. This non-provisional application is further related to non-provisional patent application Ser. No. 10/784,727 with inventors Enrique Travieso and Adam Rubenstein, entitled “DYNAMIC LANGUAGE TRANSLATION OF WEB SITE CONTENT;” to non-provisional patent application Ser. No. 10/784,726 with inventors Enrique Travieso, Adam Rubenstein, Arcadio Andrade and Collin Birdsey, entitled “AUTOMATION TOOL FOR WEB SITE CONTENT LANGUAGE TRANSLATION;” and to non-provisional patent application Ser. No. 10/784,868 with inventors Enrique Travieso, and Adam Rubenstein, entitled “SYNCHRONIZATION OF WEB SITE CONTENT BETWEEN LANGUAGES,” all of which were filed on Feb. 23, 2004. The entire teaching and contents of these related applications are hereby incorporated herein by reference.
  • FIELD OF THE INVENTION
  • The present invention generally relates to web sites, and more particularly relates to dynamic translation of web site content to another language.
  • BACKGROUND OF THE INVENTION
  • The Internet and the world-wide web have allowed consumers to complete business transactions with organizations located across continents from the comfort of their own desks. In an increasingly global marketplace, it is becoming imperative for organizations to provide web site content in multiple languages in order to expand their customer base beyond the organization's home country. In addition, as the demographics of a country change to include foreign language speakers, it is increasingly important to communicate with those customers and potential customers in their native language. For example, several large U.S. retailers have announced that serving the Hispanic segment is now a very high priority. Some U.S. retailers have even hired Hispanic ad agencies to start marketing to the Hispanic market in their native language—Spanish.
  • Currently, an organization that wants to translate its web site to another language can choose from several techniques, each having significant drawbacks. One technique involves purchasing machine translation technology. Machine translation is sometimes useful to get a rough idea of the meaning of the content in a web site, but it is far from ideal. For most organizations, this type of translation, although convenient, is not practical because the quality of the translation is simply not good enough to be posted on their web sites.
  • Another technique involves managing the translation process by deploying human translators and either maintaining multiple web sites for each language, or re-architecting the existing web site back-end technology to accommodate multiple languages. This requires significant resources in terms of time and cost, including a high level of complexity and duplication of effort. Dynamic and e-commerce sites present additional challenges, as the information to be translated resides in multiple places (e.g., a Structured Query Language database, static Hyper Text Markup Language pages and dynamic Hyper Text Markup Language page templates) and each translated site must interface with the same e-commerce or back-end engine. Further, as the web site changes, ongoing maintenance must also be handled. This approach will yield vastly superior translations that are suitable for professional web sites of large organizations, but at great cost. Most organizations simply do not have, or do not want to invest in, the resources necessary to handle this task internally.
  • For example, FIG. 1 is a block diagram illustrating the system architecture of a conventional web site. The web site of FIG. 1 is presented in a first language, such as English. FIG. 1 shows a web server 112 connected to the Internet 116 via a web connection. A public user 118, such as a person using a computer with a web connection, can access the web server 112 via the Internet 116 and download information, such as a web page 114, from the web server 112 for viewing. The web server 112 is operated by programming logic 110, consisting of instructions on how to retrieve, serve, and accept information for processing. The web server 112 further has access to a database 102 of information, as well as Hyper Text Markup Language (HTML) template files 104, graphics files 106 and multimedia files 108, all of which constitute the web site served by web server 112.
  • FIG. 2 is a block diagram illustrating the system architecture of a conventional web site presented in two languages. The web site of FIG. 2 is presented in a first language, such as English (as shown above for FIG. 1) and in a second language, such as Spanish. FIG. 2 shows the web server 112 and the other English language components described in FIG. 1, including the database 102 of information, the HTML template files 104, graphics files 106, multimedia files 108 and programming logic 110. FIG. 2 further shows the public user 118 accessing the web server 112 via the Internet 116 and downloading information, such as a web page 202 in the English or Spanish language.
  • FIG. 2 also shows the Spanish language components 204 of the web site, including the database 208 of information, the HTML template files 214, graphics files 216, multimedia files 210 and programming logic 212. The aforementioned Spanish language components are managed by a multi-lingual content manager 206, which manages requests for information in the dual languages. FIG. 2 further shows that the web server 112 must be re-engineered to serve multiple sets of content in different languages.
  • As can be seen in the difference between FIG. 1 and FIG. 2, the deployment of the Spanish language components 204 of FIG. 2 requires a significant expenditure of time and resources. Further, the deployment requires the re-engineering of the web server 112, adding to the time and cost associated with the deployment. Additionally, once the Spanish language components 204 have been established, they must be kept synchronized with the English language components, resulting in a recurring cost. This is disadvantageous, as most organizations simply do not have the resources necessary to perform this task.
  • Therefore a need exists to overcome the problems with the prior art as discussed above.
  • SUMMARY OF THE INVENTION
  • Briefly, in accordance with the present invention, disclosed is a system, method and computer readable medium for synchronizing web content. In an embodiment of the present invention, the method on an information processing system includes retrieving a first web content in a first language from a web site, the first web content corresponding to a second web content wherein the second web content is a translation in a second language of the first web content. The method further includes dividing the first web content into a plurality of translatable components and generating a unique identifier for each of the plurality of translatable components of the first web content. The method further includes matching each of the plurality of translatable components of the first web content to a plurality of translated components of the second web content using the unique identifier of each of the plurality of translatable components of the first web content. If a translatable component of the first web content is not matched to a translated component of the second web content, the method further includes designating the translatable component of the first web content for translation into the second language.
  • Also disclosed is a web server for synchronizing web content. The web server includes a web connection for retrieving a first web content in a first language from a web site, the first web content corresponding to a second web content wherein the second web content is a translation in a second language of the first web content. The web server further includes a processor for dividing the first web content into a plurality of translatable components and generating a unique identifier for each of the plurality of translatable components of the first web content. The processor is further for matching each of the plurality of translatable components of the first web content to a plurality of translated components of the second web content using the unique identifier of each of the plurality of translatable components of the first web content. If a translatable component of the first web content is not matched to a translated component of the second web content, the processor designates the translatable component of the first web content for translation into the second language.
  • Also disclosed is a computer program product including computer instructions for synchronizing web content. In an embodiment of the present invention, the computer instructions include instructions on an information processing system for retrieving a first web content in a first language from a web site, the first web content corresponding to a second web content wherein the second web content is a translation in a second language of the first web content. The computer instructions further include instructions for dividing the first web content into a plurality of translatable components and generating a unique identifier for each of the plurality of translatable components of the first web content. The computer instructions further include instructions for matching each of the plurality of translatable components of the first web content to a plurality of translated components of the second web content using the unique identifier of each of the plurality of translatable components of the first web content. If a translatable component of the first web content is not matched to a translated component of the second web content, the computer instructions further include instructions for designating the translatable component of the first web content for translation into the second language.
  • The preferred embodiments of the present invention are advantageous because of the ease of implementation of the disclosed systems. As discussed below, the present invention allows for the deployment of a corresponding web site in another language with a reduced amount of configuring of the original web site. This reduces the amount of Information Technology (IT) resources that must be consumed by the providers of the original web site and reduces the amount of time necessary for deployment. Also as discussed below, only a single link is required to be deployed on the original web site in order to provide access to the corresponding web site in another language. This is beneficial as it reduces the amount of time and effort that must be expended by the providers of the original web site in order to release the corresponding web site in another language.
  • The present invention is further advantageous because it allows for the use of human translation, thereby producing a high quality translation of the original web site in another language. This is beneficial as it reduces or avoids the use of machine translation, which can be of low quality. Additionally, the present invention preserves the formatting of the original web site, including when a translation is of a larger size or length than the original text. This is beneficial as it allows for the preservation of the look and feel of the original web site, thereby allowing users to maintain familiarity with the corresponding web site in another language.
  • The present invention is further advantageous because it supports large, complex and rapidly-changing web sites. As explained in greater detail below, the present invention supports web sites with any number of web pages, links, downloads and other materials, thereby allowing for greater flexibility and usability of the present invention. The present invention also supports web sites that change continuously or periodically, as it regularly polls the web site to discern changes and initiate corresponding translations. This is beneficial as it reduces the amount of time and effort that is expended on the maintenance of a corresponding web site in another language.
  • The present invention is further advantageous because it provides a corresponding web site in a second language, thereby meeting the needs of customers speaking the second language. This is beneficial as it generates traffic consisting of customers speaking the second language and provides customers speaking the second language a self-service e-commerce option. This is also beneficial because it provides more accessible shopping opportunities for customers in the second language and provides a more user-friendly environment for these clients in the second language.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • FIG. 1 is a block diagram illustrating the system architecture of a conventional web site.
  • FIG. 2 is a block diagram illustrating the system architecture of a conventional web site presented in two languages.
  • FIG. 3 is a block diagram illustrating the system architecture of a web site presented in two languages, in one embodiment of the present invention.
  • FIG. 4 is a block diagram illustrating the system architecture of the present invention, in one embodiment of the present invention.
  • FIG. 5 is an operational flow diagram depicting the process of the translation server, according to a preferred embodiment of the present invention.
  • FIGS. 6A-6C illustrate an operational flow diagram depicting the serving process of the translation server, according to a preferred embodiment of the present invention.
  • FIG. 7 is a block diagram depicting the serving process in an ASP model of the translation server, according to a preferred embodiment of the present invention.
  • FIG. 8 is a block diagram depicting the serving process in a web service model of the translation server, according to a preferred embodiment of the present invention.
  • FIG. 9 is a screenshot of a WebCATT interface used for viewing a translatable component, in one embodiment of the present invention.
  • FIG. 10 is a screenshot of a WebCATT interface used for viewing a translatable component along with a corresponding translation, in one embodiment of the present invention.
  • FIG. 11 is a screenshot of a WebCATT interface used for editing a translatable component, in one embodiment of the present invention.
  • FIG. 12 is a screenshot of a WebCATT interface used for viewing a translation queue, in one embodiment of the present invention.
  • FIG. 13 is an operational flow diagram depicting the process of WebCATT, according to a preferred embodiment of the present invention.
  • FIG. 14 is an operational flow diagram depicting the process of the spider, according to a preferred embodiment of the present invention.
  • FIG. 15 is an operational flow diagram depicting the synchronization process according to a preferred embodiment of the present invention.
  • FIG. 16 is a block diagram showing a computer system useful for implementing the present invention.
  • DETAILED DESCRIPTION
  • The present invention, according to a preferred embodiment, overcomes problems with the prior art by providing an efficient and easy-to-implement system and method for dynamic language translation of a web site.
  • OVERVIEW
  • FIG. 3 is a block diagram illustrating the system architecture of a web site presented in two languages, in one embodiment of the present invention. The web site of FIG. 3 is presented in a first language, such as English, and a second language, such as Spanish. FIG. 3 shows the web server 112 of FIG. 1 connected to the Internet 116 via a web connection. Also as shown in FIG. 1, a public user 118 accesses the web server 112 via the Internet 116 and downloads information, such as a web page, from the web server 112 for viewing. The user 118 utilizes a client application, such as a web browser, on his client computer to connect to the web site via the network 116. Once connected to the web site, the user 118 browses through the products or services offered by the web site by navigating through its web pages.
  • The web server 112 is operated by programming logic 110 and the web server 112 further has access to a database 102 of information, as well as HTML template files 104, graphics files 106 and multimedia files 108, all of which constitute the English components of the web site served by web server 112.
  • FIG. 3 further shows translation server 300 situated apart from and existing independently from the web server 112. The translation server 300 embodies the main functions of the present invention, including the provision of a web site in a secondary language, such as Spanish. The translation server 300 provides the secondary language components of a base web site, which is provided by web server 112, without requiring integration with the base web site or re-configuring or re-engineering of the web server 112.
  • As can be seen in the difference between FIG. 2 and FIG. 3, the deployment of the secondary language components of FIG. 3 requires significantly less time and fewer resources than the deployment of FIG. 2. Further, the deployment of FIG. 3 does not require the re-engineering of the web server 112. Additionally, once the secondary language components have been established by the translation server 300, they are automatically kept synchronized with the English language components of the base web site. Thus, the system of the present invention is advantageous as it reduces the amount of time, effort and resources that are required to deploy a secondary language web site.
  • FIG. 4 is a block diagram illustrating the system architecture of the present invention, in one embodiment of the present invention. FIG. 4 presents an alternative point of view of the system architecture of the present invention. FIG. 4 shows a web site 414 representing a web site in a first language such as English that is connected to the Internet 412 via a web connection. FIG. 4 further shows a user 416 that utilizes a web connection to the Internet 412 to browse and navigate the web pages served by the web site 414.
  • FIG. 4 further shows the translation server 400, corresponding to the translation server 300 of FIG. 3, and a translation database 406 for use by the translation server 400 in storing translatable components during the serving of web pages in a secondary language such as Spanish. This process is described in greater detail below. Also shown in FIG. 4 is the Web Computer Aided Translation Tool (WebCATT), which is a tool for aiding a human 418 or an admin 410 in translating the components of a web site in a first language. Further shown is a spider 404 for use in analyzing and sizing a web site 414. The translation server 400 and WebCATT tool 408 are connected to a web server 402, which is the conduit through which all web actions of the above tools are channeled. The translation server 400 and WebCATT tool 408 are described in greater detail below.
  • In an embodiment of the present invention, the computer systems of translation server 400, WebCATT tool 408, spider 404 and web server 402 are one or more Personal Computers (PCs) (e.g., IBM or compatible PC workstations running the Microsoft Windows 95/98/2000/ME/CE/NT/XP operating system, Unix, Linux, Macintosh computers running the Mac OS operating system, or equivalent), Personal Digital Assistants (PDAs), game consoles or any other information processing devices. In another embodiment of the present invention, the computer systems of translation server 400, WebCATT tool 408, spider 404 and web server 402 are server systems (e.g., SUN Ultra workstations running the SunOS operating system or IBM RS/6000 workstations and servers running the AIX operating system).
  • In one embodiment of the present invention, Internet network 412 is a circuit switched network, such as the Public Switched Telephone Network (PSTN). In another embodiment of the present invention, the network 412 is a packet switched network. The packet switched network is a wide area network (WAN), such as the global Internet, a private WAN, a local area network (LAN), a telecommunications network or any combination of the above-mentioned networks. In another embodiment of the present invention, network 412 is a wired network, a wireless network, a broadcast network or a point-to-point network.
  • Translation Server
  • INTRODUCTION
  • The translation server 400 is the back-end application responsible for the conversion of web pages to another language. The translation server 400 parses each incoming HTML page into translatable components, substitutes each incoming translatable component with an appropriate translated component, and returns the translated web page back to the online user 416. Page conversion is performed on the fly each time an online user 416 requests a page in the second or alternate language. When a web page is received for conversion, the translation server 400 will translate the page if enough translated content is available to meet a customer specified translation threshold. If this is not the case, then the page will be returned in the first or original language.
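  • The threshold decision described above can be illustrated with a short sketch. The text does not say how "enough translated content" is measured, so this example assumes the measure is the fraction of translatable components that have a stored translation; the name should_serve_translated and the treatment of an empty page are likewise assumptions.

```python
def should_serve_translated(components, translations, threshold):
    """Decide whether a page meets the customer-specified translation threshold.

    components:   list of translatable components found on the page
    translations: dict mapping a component to its translated component
    threshold:    required fraction of translated components, from 0.0 to 1.0
    """
    if not components:
        # Assumption: a page with nothing to translate counts as fully translated.
        return True
    translated = sum(1 for c in components if c in translations)
    return translated / len(components) >= threshold
```

  • If the function returns False, the page would be returned in the first or original language, as described above.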
  • A translatable component includes any one of a text segment, an image file with text to be translated, a multimedia file with text or audio to be translated, a file with text to be translated, a file with image with text to be translated, a file with audio to be translated and a file with video with at least one of text and audio to be translated.
  • The page conversion process follows seven major steps. In a first step, each text segment encountered is replaced with its translated text segment if a translation is available. If no translation is available, either the text remains in the original language or a machine translation is performed on the fly, depending on the customer's preference. In a second step, for each linked file (images, PDF files, Flash movies, etc.) encountered, if a translated file is available the HTML link tag is rewritten so that it points to the translated file. If a translated file is not available, the original link tag is left untouched. In a third step, any relative Uniform Resource Locator (URL) found in the page is converted to an absolute URL. This is necessary because the browser resolves relative URLs based on the URL of the current page. In the case of a translated page, the URL of the page is actually in the translation server 400. As a result, the browser would otherwise request all files and links with relative URLs from the translation server 400, which is not the correct original location.
  • In a fourth step, each JavaScript block is parsed for directive tags that indicate text content to translate. Images are automatically detected by recognition of the file extension. Script tags that reference external JavaScript files are rewritten so that they are redirected to the translation server 400. They are then parsed and translated in a separate browser Hyper Text Transfer Protocol (HTTP) request. In a fifth step, each link to another web page is rewritten so that the original URL is redirected to the translation server 400. When an online user clicks on a rewritten link, the request then goes directly to the translation server 400 and the page is in turn translated. Links to other web pages placed in JavaScript blocks are automatically recognized, either by extension or by pre-defined customer specific URL patterns, and also rewritten for redirection. This feature, which keeps the user in the alternate language as they browse the site, is called “implicit navigation”.
  • In a sixth step, for each directive tag or attribute found, the appropriate instruction is performed. In a seventh step, the translation server 400 automatically schedules the web page for translation by placing it in the WebCATT 408 translation queue, in the event a translation cannot be found for one or more text segments or linked files in the page.
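  • The third step above (converting relative URLs to absolute URLs) can be illustrated with a minimal sketch. It assumes the translation server knows the URL of the original page and resolves each link against it; the page URL and the function name `to_absolute` are hypothetical and chosen only for illustration.

```python
from urllib.parse import urljoin

# Hypothetical URL of the original (untranslated) page being converted.
ORIGINAL_PAGE_URL = "http://www.abcwidgets.com/products/widget123.html"


def to_absolute(original_page_url: str, link: str) -> str:
    """Resolve a link against the original page URL.

    Resolving against the original page (rather than the translation
    server URL) keeps files and links pointing at their real location.
    """
    return urljoin(original_page_url, link)


# A relative image path becomes an absolute URL on the customer site.
print(to_absolute(ORIGINAL_PAGE_URL, "img/widgetpicture.gif"))
# An already absolute URL is returned unchanged.
print(to_absolute(ORIGINAL_PAGE_URL, "http://www.abcwidgets.com/"))
```

    Once every URL is absolute, the browser no longer resolves it against the translation server address, which is the problem the third step is meant to avoid.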
  • FIG. 5 is an operational flow diagram depicting the process of the translation server 400, according to a preferred embodiment of the present invention. The operational flow diagram of FIG. 5 depicts the process of the translation server 400 of responding to a user request for a web page in a secondary language. The operational flow diagram of FIG. 5 begins with step 502 and flows directly to step 504.
  • In step 504, the translation server 400 receives a request from a user 416 on a web site 414, the web site 414 having a first web content in a first language such as English. The request, such as an HTTP request or a Simple Mail Transfer Protocol (SMTP) request, calls for a second web content in a second language such as Spanish. The second web content is a human or machine translation in a second language of the first web content. The first language includes any one of English, French, Spanish, German, Portuguese, Italian, Japanese, Chinese, Korean, and Arabic and the second language is different than the first language and includes any one of English, French, Spanish, German, Portuguese, Italian, Japanese, Chinese, Korean, and Arabic.
  • In step 506, the translation server 400 retrieves the first web content from the web site 414. In step 508, the translation server 400 divides the first web content into a plurality of translatable components. In step 510, the translation server 400 generates a unique identifier for each of the plurality of translatable components of the first web content. For a text segment, the translation server 400 can generate a unique identifier using a hash code, a checksum or a mathematical algorithm.
  • In step 512, the translation server 400 identifies a plurality of translated components of the second web content using the unique identifier of each of the plurality of translatable components of the first web content. In step 514, the translation server 400 arranges or puts the plurality of translated components of the second web content to preserve a format that corresponds to the first web content. The translation server 400 can arrange or put the plurality of translated components of the second web content to preserve a format that corresponds to the first web content, including putting formatting tags that are not visible in the first web content. In step 516, the translation server 400 provides the second web content in response to the request that was received. In step 518, the control flow of FIG. 5 stops.
  • FIGS. 6A-6C illustrate an operational flow diagram depicting the serving process of the translation server 400, according to a preferred embodiment of the present invention. The operational flow diagram of FIGS. 6A-6C depicts the process of the translation server 400 of providing a web page in a secondary language in response to a user request. Specifically, the operational flow diagram of FIGS. 6A-6C provides more detail with regards to steps 508-514 of FIG. 5 above. The operational flow diagram of FIGS. 6A-6C begins with step 601 and flows directly to step 602.
  • Step 601 begins with a source HTML page or first web content of step 506 of FIG. 5. In step 602, at least one portion of the first web content is parsed into a translatable component. In step 603, it is determined whether the end of the file of the first web content is reached. If the result of the determination is affirmative, then control flows to step 612. Otherwise, control flows to step 604. In step 604, it is determined whether the translatable component that was parsed in step 602 is a text segment. If the result of the determination is affirmative, then control flows to step 605. Otherwise, control flows to step 614.
  • In step 605, a hash code or other unique identifier is computed for the text segment. In step 606, using the unique identifier, a matching translated text segment is looked up in a cache. In step 607, it is determined whether the matching translated text segment is found in the cache. If the result of the determination is affirmative, then control flows to step 608. Otherwise, control flows to step 618. In step 608, it is determined whether there were multiple matching translated text segments found in the cache. If the result of the determination is affirmative, then control flows to step 620. Otherwise, control flows to step 609. In step 620, the correct translated segment is determined using the sequence constraints and a character by character comparison. In step 609, it is determined whether translation of the text segment is suppressed or not yet translated. If the result of the determination is affirmative, then control flows to step 621. Otherwise, control flows to step 610.
  • In step 610, the matching translated text segment is set as a target segment. In step 621, the current text segment is set as the target segment. In step 640, the target segment is added to the output web content, or second web content (i.e., the translated HTML page or the output HTML page). In step 623, the second web content is output for provision to the user requesting the web page.
  • In step 612, it is determined whether there is an incomplete translation of the current web page, i.e., the first web content. If the result of the determination is affirmative, then control flows to step 613. Otherwise, control flows to step 611. In step 613, the current web page is scheduled for translation. In step 611, the translation activity performed by the translation server 400 in servicing the current web page is recorded in the translation database 406. In step 625, it is determined whether the percentage of the current web page, i.e., the first web content, that is translated is above a threshold. If the result of the determination is affirmative, then control flows to step 624. Otherwise, control flows to step 626. In step 624, the second web content or translated HTML page is output for provision to the user requesting the web page. In step 626, the current web page or first web content is output unchanged for provision to the user requesting the web page.
  • In step 614, it is determined whether the translatable component parsed in step 602 is a translatable file such as a PDF file, an image file, etc. If the result of the determination is affirmative, then control flows to step 615. Otherwise, control flows to step 629. In step 629, it is determined whether the translatable component parsed in step 602 is a link to another translatable page. If the result of the determination is affirmative, then control flows to step 628. Otherwise, control flows to step 627. In step 627, a tag is added to the translated HTML page to indicate a link (this is described in greater detail below). In step 628, the link is modified to redirect the URL (this is described in greater detail below).
  • In step 615, a translated file corresponding to the translatable file is looked up in a cache. In step 616, it is determined whether the translated file was found. If the result of the determination is affirmative, then control flows to step 617. Otherwise, control flows to step 633. In step 633, the translated file is looked up in the translation database 406. In step 635, it is determined whether the translated file was found. If the result of the determination is affirmative, then control flows to step 634. Otherwise, control flows to step 632. In step 634, the translated file that was found is stored in the cache. In step 632, an incomplete translation is recorded in the translation database 406. In step 630, the original web page is set as the target file. In step 631, the target file is added to the translated HTML page.
  • In step 617, it is determined whether translation is suppressed for the translatable file. If the result of the determination is affirmative, then control flows to step 630. Otherwise, control flows to step 636. In step 636, the translated file is set as the target file. In step 618, using the unique identifier, a matching translated text segment is looked up in the translation database 406. In step 622, it is determined whether the matching translated text segment is found in the database. If the result of the determination is affirmative, then control flows to step 619. Otherwise, control flows to step 637. In step 619, the translated segment that was found is stored in the cache. In step 637, an incomplete translation is recorded in the translation database 406.
  • In step 638, it is determined whether a machine translation of the text segment can be performed. If the result of the determination is affirmative, then control flows to step 639. Otherwise, control flows to step 621. In step 639, the machine translation is set as the target segment.
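  • The text segment branch of FIGS. 6A-6C (cache lookup, database lookup, machine translation fallback, and the final threshold check) can be sketched as follows. This is a simplified illustration, not the claimed implementation: the cache and database are modeled as plain dictionaries, suppression and multiple matches are omitted, and the way the translated percentage is computed (fraction of segments with a stored translation, compared with "greater than or equal") is an assumption. All function names are hypothetical.

```python
from typing import Callable, Dict, Optional, Tuple


def resolve_segment(
    key: str,
    source_text: str,
    cache: Dict[str, str],
    database: Dict[str, str],
    machine_translate: Optional[Callable[[str], str]] = None,
) -> Tuple[str, bool]:
    """Return (target_segment, found_stored_translation)."""
    if key in cache:                       # steps 606-607: cache lookup
        return cache[key], True
    if key in database:                    # steps 618 and 622: database lookup
        cache[key] = database[key]         # step 619: store result in the cache
        return cache[key], True
    # Step 637 would record an incomplete translation here.
    if machine_translate is not None:      # step 638: machine translation possible?
        return machine_translate(source_text), False   # step 639
    return source_text, False              # step 621: keep the original text


def choose_output(original_page: str, translated_page: str,
                  found: int, total: int, threshold: float) -> str:
    """Steps 625, 624 and 626: serve the translated page only if enough is translated."""
    fraction = found / total if total else 1.0
    return translated_page if fraction >= threshold else original_page


cache: Dict[str, str] = {}
database = {"k1": "Hola"}
print(resolve_segment("k1", "Hello", cache, database))
print(resolve_segment("k2", "Bye", cache, database))
```

    After the first call the translation for "k1" is held in the cache, so later requests for the same segment skip the database lookup.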
  • ASP Model
  • The translation server 400 can be presented in a variety of models. In the Application Service Provider (ASP) model, the translation server 400 converts full web pages or script files at a time and delivers them directly to the online user 416. Under this model, the links in a web page are rewritten so that the request is redirected to the translation server 400. For example, the URL of the translation server 400 for a fictional customer called ABC Widgets is defined as: http://trans1.motionpoint.net/abcwidgets/enes/
  • Then the link <a href=“http://www.abcwidgets.com”> would be rewritten as follows: <a href=“http://trans1.motionpoint.net/abcwidgets/enes/?24;http://www.abcwidgets.com”>
  • Clicking on the above rewritten link results in the browser request being sent to the translation server 400. The translation server 400 in turn reads the original URL passed in the query string (i.e., everything after the question mark), requests the page from the ABC Widgets server, converts it to the alternate language, and sends it back to the user 416.
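  • The link rewriting described above can be sketched as a pair of small functions. The sketch assumes the query string has the form "action ID, semicolon, original URL", as in the ABC Widgets example; the function names are hypothetical.

```python
from typing import Tuple

# Translation server URL for the fictional customer ABC Widgets (from the text).
TRANSLATION_SERVER_URL = "http://trans1.motionpoint.net/abcwidgets/enes/"


def rewrite_link(original_url: str, action_id: int) -> str:
    """Prefix the original URL so that it becomes the query string."""
    return f"{TRANSLATION_SERVER_URL}?{action_id};{original_url}"


def parse_rewritten(url: str) -> Tuple[int, str]:
    """Recover (action_id, original_url) from a rewritten link.

    Only the first '?' and the first ';' are treated as separators, so an
    original URL that carries its own query string is preserved intact.
    """
    _, _, query = url.partition("?")
    action, _, original = query.partition(";")
    return int(action), original


link = rewrite_link("http://www.abcwidgets.com", 24)
print(link)
print(parse_rewritten(link))
```

    Splitting on the first separators only is a design choice made for this sketch; it lets the original URL contain further question marks or semicolons without any extra escaping.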
  • FIG. 7 is a block diagram depicting the serving process in an ASP model of the translation server 400, according to a preferred embodiment of the present invention. In a first step 702, the user 416 clicks on a link of a web page in a first language on the web site 414. The link points to a page to be translated. The translation server 400 receives the request and processes it. In a second step 704, the translation server 400 forwards the request to the web site 414 and in a third step 706, the web site 414 provides the page to the translation server 400 for translation. In a fourth step 708, the translation server 400 translates the page using the translations in the translation database 406 and sends the translated page to the user 416.
  • Web Service Model
  • In the web service model, the translated content is not delivered directly to the online user 416. Instead the customer's web site server 414 issues the request for translation to the translation server 400, which acts as a web translation service. Under this model, the translation server 400 can convert full pages or just specific text segments and/or files. When directly translating text segments or files, multiple translation requests can be issued, one per segment or file, or multiple segments and files can be translated in a single batched request.
  • FIG. 8 is a block diagram depicting the serving process in a web service model of the translation server 400, according to a preferred embodiment of the present invention. In a first step 802, the user 416 clicks on a link of a web page in a first language on the web site 414. The link points to a page to be translated. The web site server 414 receives the request and processes it. In a second step 804, the web site 414 provides the page to the translation server 400 for translation. In a third step 806, the translation server 400 provides the translated page to the web site 414. In a fourth step 808, the web site 414 sends the translated page to the user 416.
  • Hosting and Management
  • The hosting and management model defines who deploys and manages the hardware and operating system software in which the software components of the present invention reside. There are two hosting and management models: hosted & managed, and managed only. Alternately, the software can be licensed directly to the customer and the customer is responsible for both the hosting and management.
  • The hosted and managed model is a fully outsourced model in which one entity hosts the service and all translated data. Under this model, one entity deploys the translation server 400 and WebCATT 408 software on its own hardware. All hardware and software is provisioned and maintained by this entity, so the customer web site 414 has no responsibility for any hardware or software related to the service. In this model, the hosting entity is responsible for: 1) provisioning, installing, configuring and maintaining all hardware, including communication to the Internet 412, 2) installing, configuring and maintaining all operating system, web server and database server software, 3) installing, configuring and managing on an ongoing basis the translation server 400 and WebCATT 408 software and 4) maintaining staff and subcontractors that use the WebCATT 408 software to perform the translations that maintain the alternate language site in sync with the original language site.
  • In the managed only model, the translation server 400 and WebCATT 408 software are installed on the customer web site's hardware. In this model the customer web site 414 is responsible for: 1) provisioning, installing, configuring and maintaining all hardware, including communication to the Internet 412, and 2) installing, configuring and maintaining all operating system, web server and database server software. The managing entity is responsible for: 1) installing, configuring and managing on an ongoing basis the translation server 400 and WebCATT 408 software, and 2) maintaining staff and subcontractors that use the WebCATT 408 software to perform the translations that maintain the alternate language site in sync with the original language site.
  • Dedicated vs. Shared Servers
  • The components of the present invention can be deployed in dedicated or shared server environments. In a shared environment multiple customer web sites share the same hardware. In a typical scenario, multiple translation servers 400 are installed in the same web server 402, which connects to a database server containing the database 406 of translated data. A single WebCATT 408 software installation may also be shared by multiple customers. This setup is cost efficient and can be used for small and medium size sites with low-to-moderate web site traffic.
  • In a dedicated environment all hardware is dedicated to one customer web site 414. This is necessary for large organizations with heavy web site traffic and large amounts of text to be translated. In this case, either a single web server 402 or a cluster of web servers is dedicated to the customer. The database server is also normally dedicated to the customer. Dedicated servers assure guaranteed bandwidth for the customer and simplify keeping track of bandwidth usage for management and billing purposes.
  • Parsing & Translation
  • The system of the present invention does not save or maintain translated pages. Although saving translated pages may be useful for sites with static content, it becomes unmanageable for sites whose content is generated dynamically from database information in response to a user's request. Instead, the present invention stores only those components within a web page that require translation, i.e., translatable components.
  • Parsing is the process of breaking up an HTML page submitted for translation into its translatable and non-translatable components. Non-translatable components simply pass through the system unchanged (except for URLs that need rewriting). Translatable components are processed and replaced by their translated counterparts if available. There are generally two types of translatable components in a web page: text segments and files. A translatable component includes any one of a text segment, an image file with text to be translated, a multimedia file with text or audio to be translated, a file with text to be translated, a file with image with text to be translated, a file with audio to be translated and a file with video with at least one of text and audio to be translated.
  • A text segment is a chunk of text on the page as defined by the HTML that surrounds it. A text segment can range from a single word to a paragraph or multiple paragraphs. A file is any type of external content that resides on a file, is linked from within the page, and may require translation. Typical types of linked files found in web pages are images, PDF files, MS Word documents and Flash movies.
  • Below is an example of a very simple HTML page:
  • <html><head><title>Widget Product Information</title></head>
    <body>Widget <b>Model#123</b>
    <p> This widget is very useful for many chores around the house.
    <p><img src=“img/widgetpicture.gif” alt=“Product photo”>
    <p><a href=“http://www.abcwidgets.com”>Click here to return to the home page</a></body></html>
  • The above example page would by default be parsed into the following six text segments: 1) ‘Widget Product Information’, 2) ‘Widget’, 3) ‘Model#123’, 4) ‘This widget is very useful for many chores around the house.’, 5) ‘Product photo’, 6) ‘Click here to return to the home page’. The above example page would further be parsed into the following one file: img/widgetpicture.gif.
  • By default the parsing system breaks up text segments according to the HTML tags in the page. In the above example, the sentence ‘Widget Model#123’ was broken up into two segments because there was an HTML bold tag (<b>) in the middle of it. However, the parsing system is flexible and allows defining, on a per-customer basis, which HTML tags are formatting tags that should not break up text segments. So if we define the bold tag as a formatting tag, then the example page would instead be parsed into the following five text segments: 1) ‘Widget Product Information’, 2) ‘Widget <b>Model#123</b>’, 3) ‘This widget is very useful for many chores around the house.’, 4) ‘Product photo’, 5) ‘Click here to return to the home page’.
  • The bold tags now became part of the second text segment, allowing the translator to place them in the correct location in the alternate language. For example, translating the text segment ‘Widget <b>Model#123</b>’ to Spanish will result in flipping the order of the ‘Widget’ and ‘Model’ words within the sentence. Since the bold tag is part of the text segment it can be moved so it still holds the word ‘Model’, as shown: <b>Modelo No. 123</b> de Artefacto
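  • The default parsing and the effect of declaring a formatting tag can be sketched with Python's standard `html.parser` module. This is an illustrative approximation only: the class `SegmentParser`, the function `parse_page` and the rule that every non-formatting tag ends the current segment are assumptions, and only the `alt` and `src` attributes of image tags are handled.

```python
from html.parser import HTMLParser


class SegmentParser(HTMLParser):
    """Split an HTML page into text segments and linked image files."""

    def __init__(self, formatting_tags=()):
        super().__init__(convert_charrefs=True)
        self.formatting_tags = set(formatting_tags)
        self.segments, self.files = [], []
        self._buffer, self._has_text = "", False

    def _flush(self):
        # Emit the buffered segment (whitespace normalized) if it holds text.
        if self._has_text:
            self.segments.append(" ".join(self._buffer.split()))
        self._buffer, self._has_text = "", False

    def handle_starttag(self, tag, attrs):
        if tag in self.formatting_tags:
            self._buffer += f"<{tag}>"   # formatting tags stay inside the segment
            return
        self._flush()                    # any other tag ends the current segment
        attributes = dict(attrs)
        if tag == "img":
            if attributes.get("src"):
                self.files.append(attributes["src"])
            if attributes.get("alt"):
                self.segments.append(attributes["alt"])

    def handle_endtag(self, tag):
        if tag in self.formatting_tags:
            self._buffer += f"</{tag}>"
        else:
            self._flush()

    def handle_data(self, data):
        if data.strip():
            self._has_text = True
        self._buffer += data

    def close(self):
        super().close()
        self._flush()


def parse_page(html, formatting_tags=()):
    parser = SegmentParser(formatting_tags)
    parser.feed(html)
    parser.close()
    return parser.segments, parser.files


PAGE = """<html><head><title>Widget Product Information</title></head>
<body>Widget <b>Model#123</b>
<p> This widget is very useful for many chores around the house.
<p><img src="img/widgetpicture.gif" alt="Product photo">
<p><a href="http://www.abcwidgets.com">Click here to return to the home page</a></body></html>"""

print(parse_page(PAGE))                         # six segments, one file
print(parse_page(PAGE, formatting_tags={"b"}))  # five segments, one file
```

    With no formatting tags the sketch yields the six segments listed above; with the bold tag declared as a formatting tag, ‘Widget <b>Model#123</b>’ is kept as a single segment.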
  • Below is an example of how the example page is converted to Spanish by the translation server 400:
  • <html><head><title>Informacion del Artefacto</title></head>
    <body>
    <b>Modelo No. 123</b> del Artefacto
    <p>Este artefacto es muy util para todo tipo de trabajos en la casa.
    <p><img src=“http://www.trans1.motionpoint.net/img/abcwidgets/ES_24.gif” alt=“Foto del Producto”>
    <p><a href=“http://trans1.motionpoint.net/abcwidgets/enes/?24;http://www.abcwidgets.com”>Haga clic aqui para regresar a la pagina principal</a></body></html>
  • In order to convert the page, the translation server 400 performed several changes to the page. Each text segment was replaced with a corresponding translation. It is important to note that the text of the image description (‘Product photo’) placed in the ‘alt’ attribute of the image tag was recognized as a text segment and translated. The translation server 400 can recognize text segments inside attributes of HTML tags, such as the text in buttons of a form.
  • Further, the URL of the image tag was replaced to point to a translated image file. The translation server 400 only executes this action if a translated file has been defined (since many images do not have text and thus do not require translation), otherwise it does not change the URL of the image (except to make the URL absolute if it is not). In this example it is assumed that the ‘ES_24.gif’ image file was defined in WebCATT 408 as the translation for the ‘widgetpicture.gif’ file.
  • The URL of the home page link was rewritten from ‘http://www.abcwidgets.com’ to ‘http://trans1.motionpoint.net/abcwidgets/enes/?24;http://www.abcwidgets.com’ in order to redirect it to the translation server 400. This is done so when the online user clicks on the ‘Click here to return to the home page’ link, the request will go directly to the translation server 400 and the home page will also be translated. This process is called implicit navigation and it is explained in more detail below.
  • Implicit Navigation
  • Implicit navigation is a translation server 400 feature that keeps an online user 416 in the alternate language as he/she browses a web site. Implicit navigation is implemented by rewriting the URLs in the applicable links inside a page as the page is being translated, so they are redirected to the translation server 400. As a result, not only is the page translated, but also all applicable links to other translated pages within the page are modified so that when the consumer clicks on the linked page it will also be automatically translated.
  • To rewrite a link, the translation server 400 prefixes the original URL with the URL of the translation server 400, so the original URL becomes the query string to the translation server 400 URL. When a rewritten link is clicked, the request goes to the translation server 400, which reads the query string to obtain the original URL to be translated and requests the page to be translated from this URL. The translation server 400 then converts the page received to the alternate language and delivers the translated page to the consumer directly.
  • When a link is rewritten, the original URL is only one part of the query string. The other part of the query string is a special numeric action ID, which provides information about the type of conversion request being performed.
  • The following describes some supported base action IDs. “1” indicates no action. “2” indicates pages that were not translated, or for which the translation did not meet the minimum translation percentage, and therefore should not be returned. “4” indicates HTML to be translated is being submitted as POST data when processing a POST request. If this action is not specified, then the URL passed in the query string is accessed in order to obtain the HTML to be translated. “8” indicates that all relative URLs in the HTML should be converted to absolute URLs. This is necessary only in GET requests. If relative URLs are not used in the document, this action should not be specified. “16” indicates implicit navigation is enabled. “32” indicates the request includes cookie data to be passed back as cookies to all URLs to be translated.
  • “64” indicates that all links in the page are to be disabled. This overrides action ID “16” if also specified. “128” indicates translation of the page is to be disabled. This is used to process tags without affecting content. “8192” indicates a translation is being requested from WebCATT 408 for previewing. The translation server 400 adds special HTML tags to the web page to allow highlighting translated as opposed to not-translated segments, disabling links to other pages, adding alternate language hover preview features, and allowing editing a segment or file by clicking on it in the preview page.
  • Actions may be combined by using the sum of the IDs as the action ID. For example, the following illustrates how implicit navigation is performed on a link of a fictional online retailer ABC Widgets: <a href=“http://www.abcwidgets.com/product_listing.jsp?category=TV”>See TV Products Listing</a>
  • In order to translate the listing page to Spanish, the link is rewritten as follows: <a href=“http://trans1.motionpoint.net/abcwidgets/enes/?24;http://www.abcwidgets.com/product_listing.jsp?category=TV”>See TV Products Listing</a>
  • In the above example the original URL is: http://www.abcwidgets.com/product_listing.jsp?category=TV
  • The translation server 400 URL is: http://trans1.motionpoint.net/abcwidgets/enes/
  • And the action ID is “24”, which means to enable implicit navigation and to convert relative URLs to absolute.
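  • Because each base action ID listed above is a distinct power of two, summing IDs is equivalent to combining bit flags. The following sketch models this with Python's `enum.IntFlag`; the member names are hypothetical labels for the numeric IDs given in the text, and the helper `navigation_enabled` is an assumption about how the override of “16” by “64” might be checked.

```python
from enum import IntFlag


class Action(IntFlag):
    """Base action IDs from the text; the member names are illustrative only."""
    NO_ACTION = 1
    NO_RETURN_IF_UNTRANSLATED = 2
    HTML_IN_POST_DATA = 4
    ABSOLUTE_URLS = 8
    IMPLICIT_NAVIGATION = 16
    PASS_COOKIES = 32
    DISABLE_LINKS = 64
    DISABLE_TRANSLATION = 128
    WEBCATT_PREVIEW = 8192


def navigation_enabled(action_id: int) -> bool:
    """Implicit navigation applies only if "16" is set and "64" is not."""
    flags = Action(action_id)
    return Action.IMPLICIT_NAVIGATION in flags and Action.DISABLE_LINKS not in flags


# 8 + 16 = 24: convert relative URLs to absolute and enable implicit navigation.
action_id = int(Action.ABSOLUTE_URLS | Action.IMPLICIT_NAVIGATION)
print(action_id)
print(navigation_enabled(action_id))
print(navigation_enabled(action_id + int(Action.DISABLE_LINKS)))
```

    Decoding a received action ID is then a matter of testing each flag, which is how the sketch determines that “64” overrides “16” when both are present.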
  • The scope of implicit navigation can be pre-defined by domain and/or URL patterns. In a typical scenario, only pages being served from a specific domain(s) should be translated. In the ABC Widgets example, if the implicit navigation domains are defined as abcwidgets.com and abcwidgets.net, then only URLs within those two domains will be rewritten. If a more granular translation is required, such as when translating only a part of a web site, then URL patterns can be used. For example, if ABC Widgets wishes not to translate the careers and investor relations sections of their site, then the following two example Exclude URL patterns could be used: 1) abcwidgets.com/careers/ and 2) abcwidgets.com/investor/
  • Any URLs for pages residing within the above two paths would not be rewritten and thus never translated. On the other hand, if ABC Widgets wishes only to translate its online product catalog, then the following example Include URL pattern could be used: abcwidgets.com/catalog/
  • In that case, only pages residing within the abcwidgets.com/catalog/ path are rewritten and thus translated. Include and Exclude URL patterns may be combined to better define the scope of the translation. Implicit navigation can also be controlled from within the HTML to be translated through the use of directive tags or directive attributes. These are explained in detail below.
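  • The domain and URL pattern scoping described above can be sketched as a single predicate. The matching semantics are not specified in the text, so this sketch assumes that a pattern matches when it appears as a substring of the host plus path, and that subdomains of a listed domain are in scope; the function name `in_scope` is hypothetical.

```python
from urllib.parse import urlsplit


def in_scope(url, domains, include=(), exclude=()):
    """Decide whether a link should be rewritten for implicit navigation."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    # The URL must belong to one of the implicit navigation domains.
    if not any(host == d or host.endswith("." + d) for d in domains):
        return False
    target = host + parts.path
    # Exclude patterns always win.
    if any(pattern in target for pattern in exclude):
        return False
    # If Include patterns are defined, at least one must match.
    if include and not any(pattern in target for pattern in include):
        return False
    return True


DOMAINS = ("abcwidgets.com", "abcwidgets.net")
EXCLUDE = ("abcwidgets.com/careers/", "abcwidgets.com/investor/")

print(in_scope("http://www.abcwidgets.com/catalog/tv.html", DOMAINS, exclude=EXCLUDE))
print(in_scope("http://www.abcwidgets.com/careers/jobs.html", DOMAINS, exclude=EXCLUDE))
```

    Giving Exclude patterns precedence over Include patterns is a design choice of this sketch; the text only states that the two kinds of pattern may be combined.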
  • E-Commerce Database Language Enabling
  • The system of the present invention enables users to access the same original language e-commerce database in multiple languages. Since the translation server 400 processes web pages after they have left the customer web site 414, but before they reach the user 416, it does not affect a web server's e-commerce technology. As a result, the same web site 414 can be accessed in multiple languages, and all users are accessing the same e-commerce database simultaneously.
  • For example, an auction web site can allow users in different countries to bid on the same item. Each user can view the site and bid on the item in his native language. Since all bids from the different countries are actually hitting the same web site and the same e-commerce engine, all bids occur in real time and each user can see in real-time what all the other users in all other countries are bidding.
  • Text Segment Matching
  • When looking up a suitable translation for a text segment in an HTML page, a character-by-character comparison of the text in the segment against a database 406 of stored text segments is not ideal because it is very time consuming. As a result, in one embodiment of the present invention, the translation server 400 computes three 64-bit numeric hash codes from each incoming text segment. The hash code function is optimized to spread the resulting hash code across the full range of 64-bit numeric values (−9223372036854775808 to 9223372036854775807).
  • The three hash codes are computed as follows: 1) hash code 1 is based on all characters in the segment, 2) hash code 2 is based on the odd characters in the segment and 3) hash code 3 is based on the even characters in the segment. By distributing the hash code computations in this manner, the chances of key collisions are drastically reduced. The three computed hash codes make a composite key that represents each text segment in the memory caches and in the database. In the unlikely event that multiple text segments are represented by the same composite key, the translation server 400 will then resort to a character-by-character match.
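  • The composite key can be sketched as follows. The text does not name the hash function, so this example substitutes 64-bit FNV-1a purely for illustration; treating "odd characters" as the 1st, 3rd, 5th and so on (1-based positions) is likewise an assumption, as are the function names.

```python
from typing import Tuple

_MASK_64 = 0xFFFFFFFFFFFFFFFF


def fnv1a_64(text: str) -> int:
    """64-bit FNV-1a hash, returned as a signed 64-bit integer.

    FNV-1a is a stand-in here; the actual hash function is not specified.
    """
    h = 0xCBF29CE484222325                    # FNV-1a 64-bit offset basis
    for byte in text.encode("utf-8"):
        h ^= byte
        h = (h * 0x100000001B3) & _MASK_64    # FNV 64-bit prime, wrap to 64 bits
    # Map into the signed range -9223372036854775808..9223372036854775807.
    return h - (1 << 64) if h >= (1 << 63) else h


def composite_key(segment: str) -> Tuple[int, int, int]:
    """Hash code 1: all characters; 2: odd positions; 3: even positions."""
    return (
        fnv1a_64(segment),
        fnv1a_64(segment[0::2]),   # 1st, 3rd, 5th, ... characters
        fnv1a_64(segment[1::2]),   # 2nd, 4th, 6th, ... characters
    )


print(composite_key("Click here to return to the home page"))
```

    Two segments share a composite key only if all three hash codes collide at once, at which point a character-by-character comparison would settle the match.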
  • Text Segment Locking
  • Occasionally, the meaning of a word or phrase may change depending on the context in which it's being used. It is also possible that the translation itself may vary depending on the context or placement of a text segment, even if the original meaning does not change. As a result, it may be necessary to specify multiple translations for the same word or phrase, one for each usage context. The text segment locking feature allows translators to do this by providing the ability to “lock” translated text segments together. When two or more translated text segments are locked together they are used only when the exact translation sequence is followed.
  • For example, the translation to Spanish of the text segment “Virtual Brochures” can vary, depending on where it is used. Below is this segment used in an English HTML sentence: <b>Virtual Brochures</b> are great. The corresponding translation to Spanish is: <b>Los Folletos Virtuales</b> son excelentes. Another example of a segment used in an English HTML sentence: There are many great <b>Virtual Brochures</b>. The corresponding translation to Spanish is: Hay muchos excelentes <b>Folletos Virtuales</b>.
  • For this example, we assume that the HTML bold (<b>) tag is not defined as a formatting tag and, therefore, forces each sentence above to be broken up into two text segments each. As a result, the phrase “Virtual Brochures” becomes a separate text segment that requires a different translation for each case. Using the text segment locking feature in WebCATT 408, the translator locks the “Los Folletos Virtuales” translated segment with the “son excelentes” translated segment in the first sentence, and the “Hay muchos excelentes” translated segment with the “Folletos Virtuales” translated segment in the second sentence.
  • At conversion time, when the translation server 400 encounters the “Virtual Brochures” segment in the first sentence it looks up a corresponding translated segment and gets back two potential matches: “Los Folletos Virtuales” and “Folletos Virtuales”. It then proceeds to look up a translated segment for the next segment “are great” and gets back “son excelentes”. Since “son excelentes” is locked to “Los Folletos Virtuales”, the translation server 400 is able to determine that “Los Folletos Virtuales” is the correct translation to the previous segment “Virtual Brochures”.
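  • The lock-based disambiguation in the example above can be sketched as follows. The data structures (`TRANSLATIONS`, `LOCKS`) and the function name `translate_sequence` are hypothetical, and falling back to the first candidate when no lock applies is an assumption that the specification does not address.

```python
# Candidate translations per original segment (from the example above).
TRANSLATIONS = {
    "Virtual Brochures": ["Los Folletos Virtuales", "Folletos Virtuales"],
    "are great": ["son excelentes"],
    "There are many great": ["Hay muchos excelentes"],
}

# Locked pairs: (translated segment, translated segment that follows it).
LOCKS = {
    ("Los Folletos Virtuales", "son excelentes"),
    ("Hay muchos excelentes", "Folletos Virtuales"),
}


def translate_sequence(segments):
    """Pick one translation per segment, using locks to resolve ambiguity."""
    candidates = [TRANSLATIONS[s] for s in segments]
    chosen = []
    for i, options in enumerate(candidates):
        if len(options) == 1:
            chosen.append(options[0])
            continue
        prev_options = candidates[i - 1] if i > 0 else []
        next_options = candidates[i + 1] if i + 1 < len(candidates) else []
        pick = options[0]  # assumed fallback when no lock matches
        for option in options:
            locked_to_prev = any((p, option) in LOCKS for p in prev_options)
            locked_to_next = any((option, n) in LOCKS for n in next_options)
            if locked_to_prev or locked_to_next:
                pick = option
                break
        chosen.append(pick)
    return chosen
```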
  • Form Posting
  • The translation server 400 transparently handles form submissions via GET or POST methods. This means that all form data is forwarded to the original URL that processes the form and that the response page is converted to the alternate language. The first step in the form handling is performed when an HTML page that has a form in it is being converted.
  • If the form is submitted via POST method, then the translation server 400 simply rewrites the URL in the ACTION attribute of the <FORM> tag. This is done by prefixing the original URL with the URL of the translation server 400, so the original URL becomes the query string to the translation server 400 URL, much like the implicit navigation feature in standard links. The browser will perform the POST request to the translation server 400, which will read the query string to obtain the original URL where the form is to be submitted and perform the POST to that URL, forwarding it all form data. The translation server 400 then reads the response page, converts it to the alternate language, and delivers the translated page to the user directly.
  • If the form is submitted via the GET method, then the translation server 400 cannot simply rewrite the URL in the ACTION attribute of the <FORM> tag because in a GET method the form data is sent in the query string. As a result, the browser would replace the original URL with the form data and the translation server 400 would not know to what URL to submit the form data. To overcome this limitation, the translation server 400 adds a hidden field to the form whose value contains the original URL, and replaces the URL in the ACTION attribute of the <FORM> tag so the request is sent to the translation server 400. The browser will perform the GET submission to the translation server 400, which will read the value of the hidden form field to obtain the original URL where the form is to be submitted and perform the GET submission to that URL, forwarding it all form data. The translation server 400 then reads the response page, converts it to the alternate language, and delivers the translated page to the consumer directly.
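  • The two form-rewriting cases above can be sketched as follows. The hidden field name `mp_original_url` is purely hypothetical (the specification does not name it), and the translation server URL is borrowed from the examples elsewhere in this description.

```python
from urllib.parse import urljoin

# Translation server URL prefix, as used in the examples in this description.
SERVER_PREFIX = "http://trans1.motionpoint.net/abcwidgets/enes/?24;"
HIDDEN_FIELD = "mp_original_url"  # hypothetical name for the hidden field


def rewrite_form(action, method, page_url):
    """Return the rewritten ACTION and any hidden fields to add to the form."""
    original = urljoin(page_url, action)  # resolve relative ACTION URLs
    if method.lower() == "post":
        # POST: the original URL travels in the query string of the new ACTION.
        return {"action": SERVER_PREFIX + original, "hidden_fields": {}}
    # GET: the browser overwrites the query string with the form data,
    # so the original URL is carried in a hidden field instead.
    server_url = SERVER_PREFIX.split("?", 1)[0]
    return {"action": server_url, "hidden_fields": {HIDDEN_FIELD: original}}
```

  • For example, `rewrite_form("showwidget.jsp", "post", "http://www.abcwidgets.com/index.html")` yields an ACTION that prefixes the original form URL with the translation server URL.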
  • JavaScript/VBScript Handling
  • The translation server 400 is capable of translating text segments and files located inside JavaScript or VBScript code. Common types of files can be recognized automatically by their standard extensions. The translation server 400 parses all JavaScript code blocks and replaces the URLs of all files for which a translation exists so that they point to the translated files. Non-standard file extensions and URL patterns may be defined on a per-customer basis to allow the translation server 400 to recognize less common or proprietary file formats, or even dynamically generated files. File recognition and translation can also be controlled from within the JavaScript code through the use of directive tags, which are explained in detail below. Text segments inside script code that require translation must be explicitly identified by placing a set of directive tags around the text.
  • Translation of content inside JavaScript or VBScript include files is also supported. A script include file is downloaded by the browser in a separate HTTP request and included in the web page as if it had appeared within the page. Include files are handled in the same manner as implicit navigation in standard links within the page. The URL of the include file is rewritten so the original include file is prefixed with the URL of the translation server 400 and the original file URL becomes the query string to the translation server 400 URL. The browser will then request the include file from the translation server 400, which will read the query string to obtain the URL of the original include file and request it from its location. The translation server 400 then reads the file, performs the appropriate conversions, and delivers the modified file to the browser for inclusion in the web page.
  • JavaScript include files are specified using the source (src) attribute in the <SCRIPT> tag, as shown: <script language="javascript" src="menu.js"></script>
  • Shown is an example of how the above script tag is rewritten so the content inside the JavaScript include file is translated: <script language="javascript" src="http://trans1.motionpoint.net/abcwidgets/enes/?24;http://www.abcwidgets.com/menu.js"></script>
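  • The include-file rewrite and the server-side recovery of the original URL can be sketched as follows. The function names are hypothetical, and the sketch assumes that everything after the first ";" in the query string is the original URL; the meaning of the leading "24" is not specified here and is simply skipped.

```python
from urllib.parse import urljoin, urlsplit

# Translation server URL prefix, as used in the example above.
SERVER_PREFIX = "http://trans1.motionpoint.net/abcwidgets/enes/?24;"


def rewrite_include(src, page_url):
    """Prefix the (absolute) include-file URL with the translation server URL."""
    return SERVER_PREFIX + urljoin(page_url, src)


def original_from_request(request_url):
    """Recover the original file URL from the translation server query string."""
    query = urlsplit(request_url).query       # e.g. "24;http://.../menu.js"
    _, _, original = query.partition(";")     # drop the leading "24"
    return original
```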
  • Directive Tags and Attributes
  • Directive tags and directive attributes are special HTML tags and attributes that allow more granular control over the translation and implicit navigation within a web page. Directive tags are special HTML comment tags that are ignored by the browser, but provide specific instructions to the translation server 400. Directive attributes are specially named attributes placed within an HTML tag that are also ignored by the browser, but provide specific instructions to the translation server 400 that apply only to the tag in which the attribute is placed.
  • Translation control tags and attributes are used to specify sections on a web page that should not get translated. One important use of translation control tags is to delimit personal information, such as a person's name, address, credit card numbers, etc., that may show up in a web page but should not be processed: it simply passes through the translation server 400 without being translated or stored, for security and privacy reasons.
  • Following is an exemplary list of directive tags. The directive tag “mp_trans_partial_start & mp_trans_partial_end” signals the start and end of a partial translation section. This tag may be used at the top of a web page in conjunction with section translate tags to selectively translate sections of a page. The directive tag “mp_trans_enable_start & mp_trans_enable_end” signals the start and end of a section to be translated within a partial translation section. All text and files within this section are translated. The directive tag “mp_trans_disable_start & mp_trans_disable_end” signals the start and end of a section not to be translated when in normal translation mode. The directive tag “mp_trans_machine_start & mp_trans_machine_end” signals that any text segments enclosed within the tags may be machine translated in the event that a human translation is not available.
  • Following is a list of directive attributes. The directive attribute “mpdistrans” disables translation of a file or of translatable text in a tag, such as alt, keywords or description meta-tag, or form buttons.
  • Below is an example of usage of translation control directive tags and attributes:
  • <html><head>
    <meta name="description" content="This page description is translated">
    <meta mpdistrans name="keywords" content="These keywords are not translated, keyword1, keyword2, keyword3, keyword4, keyword5">
    <title>This title is translated</title></head><body>
    This text and the image widget1.gif below are translated.
    <img src="img/widget1.gif" alt="This image description is translated">
    <p><img mpdistrans src="img/widget2.gif" alt="This image and this description are NOT translated because of the mpdistrans attribute">
    <!-- mp_trans_disable_start -->
    This text and the image widget3.gif below are NOT translated because they are inside a translation disabled section.
    <img src="img/widget3.gif" alt="This image description is NOT translated">
    <!-- mp_trans_disable_end --> This text is translated.
    <!-- mp_trans_partial_start --> This text is NOT translated because it is inside a partially translated section and not specifically designated as translatable content.
    <!-- mp_trans_enable_start --> This text is translated because it is inside a partially translated section and it is specifically designated as translatable content.
    <!-- mp_trans_enable_end --> This text is NOT translated because it is inside a partially translated section and not specifically designated as translatable content.
    <!-- mp_trans_partial_end --> This text is translated.</body></html>
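  • The effect of the translation control directive tags in the example above can be sketched as a small state machine. This is only an illustration: the page is assumed to be pre-tokenized into ("comment", name) and ("text", value) pairs, and the names `SWITCHES` and `mark_translatable` are hypothetical.

```python
# Directive comment name -> (state flag it controls, value it sets).
SWITCHES = {
    "mp_trans_partial_start": ("partial", True),
    "mp_trans_partial_end": ("partial", False),
    "mp_trans_enable_start": ("enabled", True),
    "mp_trans_enable_end": ("enabled", False),
    "mp_trans_disable_start": ("disabled", True),
    "mp_trans_disable_end": ("disabled", False),
}


def mark_translatable(tokens):
    """Return (text, should_translate) pairs for a tokenized page."""
    state = {"partial": False, "enabled": False, "disabled": False}
    marked = []
    for kind, value in tokens:
        if kind == "comment":
            if value in SWITCHES:
                flag, new_value = SWITCHES[value]
                state[flag] = new_value
            continue
        if state["partial"]:
            # Partial mode: only explicitly enabled sections are translated.
            translate = state["enabled"]
        else:
            # Normal mode: everything is translated unless disabled.
            translate = not state["disabled"]
        marked.append((value, translate))
    return marked
```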
  • Following is a list of directive attributes for implicit navigation control. The directive attribute “mpnav” enables implicit navigation for listed attributes in the tag. This attribute can be used for tags whose attributes do not normally contain URLs but do on a particular page, such as the value attribute of an <OPTION> tag. The directive attribute “mpdisnav” disables implicit navigation for all attributes or only listed attributes of the tag. The directive attribute “mporgnav” forces original navigation for all attributes or only listed attributes of the tag. Original navigation removes the redirection to the translation server if one is found; otherwise it leaves the link intact. This directive attribute is discussed below with reference to one-link deployment.
  • Below is an example of usage of implicit navigation control directive attributes.
  • <html><body>ABC Widgets Home Page
    <p><a href="widgets.jsp">See all useful widgets</a>
    <p><a mpdisnav href="uselesswidgets.jsp">See useless widgets</a>
    <p><form action="showwidget.jsp" method="post"><select name="WidgetSel">
    <option value SELECTED>Select a widget to view:</option>
    <option mpnav="value" value="widget1.jsp">Widget 1</option>
    <option mpnav="value" value="widget2.jsp">Widget 2</option>
    </select></form></body></html>
  • The translation server 400 would process the above page as follows:
  • <html><body>Pagina Principal de ABC Widgets
    <p><a href="http://trans1.motionpoint.net/abcwidgets/enes/?24;http://www.abcwidgets.com/widgets.jsp">ver artefactos utiles</a>
    <p><a mpdisnav href="uselesswidgets.jsp">Ver artefactos inutiles</a>
    <p><form action="http://trans1.motionpoint.net/abcwidgets/enes/?24;http://www.abcwidgets.com/showwidget.jsp" method="post">
    <select name="WidgetSel">
    <option value SELECTED>Escoja un artefacto para verlo:</option>
    <option mpnav="value" value="http://trans1.motionpoint.net/abcwidgets/enes/?24;http://www.abcwidgets.com/widget1.jsp">Artefacto 1</option>
    <option mpnav="value" value="http://trans1.motionpoint.net/abcwidgets/enes/?24;http://www.abcwidgets.com/widget2.jsp">Artefacto 2</option>
    </select></form></body></html>
  • It can be seen above that implicit navigation was not performed for the anchor (<A>) tag with the mpdisnav attribute. As a result, when the user clicks on the ‘Ver artefactos inutiles’ link, the uselesswidgets.jsp web page is not redirected to the translation server 400 and therefore it is not translated. Furthermore, the mpnav attribute placed in the two <OPTION> tags instructed the translation server 400 to perform implicit navigation on the URL specified in the value attribute of each tag.
  • Following is a list of directive tags for JavaScript/VBScript control. The directive tag “mp_trans_textjs_start & mp_trans_textjs_end” signals the start and end of a section inside a script block that contains text to be translated. The directive tag “mp_trans_imgjs_start & mp_trans_imgjs_end” signals the start and end of a section inside a script block that contains images, PDF, Flash or other files to be translated. Under most circumstances these tags are not needed as the translation server 400 JavaScript parser can automatically recognize common types of files by their standard extensions.
  • The directive tag “mp_trans_supressurljs_start & mp_trans_supressurljs_end” signals the start and end of a section inside a script block that inhibits the processing of URLs. URLs are processed for implicit navigation, or to convert relative URLs to absolute URLs if implicit navigation is disabled. This tag may be necessary to avoid processing portions of URLs that are used to build up a final URL by means of concatenation.
  • Below is an example of usage of script control directive tags:
  • <script language="Javascript"><!--
    function CheckLoginForm() {
      <!-- mp_trans_textjs_start -->
      var usermsg = "User name is required\n";
      var pswdmsg = "Password is required\n";
      var hdrmsg = "Please correct the following errors:\n";
      <!-- mp_trans_textjs_end -->
      var message = "";
      // Compare with == (not assign with =) to test for empty fields.
      if (document.LoginForm.login_user.value == "") {
        message = message + usermsg;
      }
      if (document.LoginForm.login_pass.value == "") {
        message = message + pswdmsg;
      }
      if (message == "") {
        document.LoginForm.submit();
      } else {
        message = hdrmsg + message;
        alert(message);
      }
    }
    //--></script>
  • The above CheckLoginForm function verifies that an online user has entered a login name and password before posting the LoginForm form in the page. If a user has not entered the required information, then a pop-up alert box shows an error message with details. The text of the various error messages is assigned to variables and enclosed in a set of ‘mp_trans_textjs’ directive tags so it can be recognized and translated.
  • “One-Link” Deployment
  • One of the primary goals of the TransMotion system is to eliminate or minimize the workload of a customer web site's IT department in order to deploy an alternate language web site. The one-link deployment feature allows a customer to deploy the alternate language web site by simply placing one language-switching link in the home page of the original language site.
  • The one-link deployment is a combination of two features: (1) automatic flipping of the language-switching link, and (2) implicit navigation to maintain the user in the alternate language.
  • Automatic flipping of the language-switching link is specified by using the mporgnav directive attribute in the language-switching link. The mporgnav directive attribute instructs the translation server 400 to rewrite the URL to support automatic language switching.
  • Below is an example of a very simple home page:
  • <html><body>Welcome to the ABC Widgets Home Page
    <p><a href="widgets.jsp">Click here to see all widgets we sell</a>
    </body></html>
  • In order to deploy a mirror Spanish language web site, all that has to be done is to place one link in the home page that redirects the home page to ABC Widgets' translation server 400. Below is an example of the above home page with the new language-switching link added:
  • <html><body>Welcome to the ABC Widgets Home Page<p>
    <a mporgnav href="http://trans1.motionpoint.net/abcwidgets/enes/?24;http://www.abcwidgets.com">Click here to see this site in Spanish</a>
    <p><a href="widgets.jsp">Click here to see all widgets we sell</a>
    </body></html>
  • When a user clicks the ‘Click here to see this site in Spanish’ language-switching link, the translation server 400 returns the home page translated, as shown below:
  • <html><body>Bienvenidos a la Pagina Principal de ABC Widgets<p>
    <a mporgnav href="http://www.abcwidgets.com">Haga clic aqui para ver este sitio web en Ingles</a><p>
    <a href="http://trans1.motionpoint.net/abcwidgets/enes/?24;http://www.abcwidgets.com/widgets.jsp">Haga clic aqui para ver todos los artefactos que vendemos</a></body></html>
  • As shown above, in addition to translating the page, the translation server 400 also rewrites the URL in the language-switching link and performs implicit navigation of all other URLs in the page. The translation server 400 rewrites the URL in the language-switching link so that the translation server 400 redirection is removed. The mporgnav directive attribute is used to instruct the translation server 400 to do this. In addition, the link text ‘Click here to see this site in Spanish’ is translated as ‘Haga clic aqui para ver este sitio web en Ingles’ (which means ‘Click here to see this site in English’). This automatic and simultaneous change of both the URL and the text (or image) in the language-switching link by the translation server 400 is what allows the user to flip back-and-forth between English and Spanish.
  • Implicit navigation is also performed in all the links on the page. In the above example home page, it was performed on the widgets.jsp page. As a result, when a user clicks on this rewritten link, the widgets.jsp page is in turn translated and implicit navigation performed on all of its links within the abcwidgets.com domain. This process is repeated so that the user is always navigating the site in the alternate language.
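  • The per-link decision made during one-link deployment can be sketched as follows. The function name `rewrite_href` and the representation of directive attributes as a set of names are hypothetical; the sketch covers only the three behaviours described above (original navigation, disabled navigation, and implicit navigation).

```python
from urllib.parse import urljoin

# Translation server URL prefix, as used in the examples above.
SERVER_PREFIX = "http://trans1.motionpoint.net/abcwidgets/enes/?24;"


def rewrite_href(href, directives, page_url):
    """Rewrite one link URL according to its directive attributes."""
    if "mporgnav" in directives:
        # Original navigation: strip the redirection if present,
        # otherwise leave the link intact.
        if href.startswith(SERVER_PREFIX):
            return href[len(SERVER_PREFIX):]
        return href
    if "mpdisnav" in directives:
        # Implicit navigation disabled: the link is not redirected.
        return href
    # Default: implicit navigation keeps the user on the translated site.
    return SERVER_PREFIX + urljoin(page_url, href)
```

  • Applied to the translated home page above, the language-switching link loses its translation server prefix while the widgets.jsp link gains one, which is what lets the user flip between the two languages.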
  • Customized Content
  • The translation server 400 allows customized content to be delivered according to the language and/or locale in which a user is viewing the site. When the translation server 400 requests a web page for translation, it sends two cookies to the original web server called ‘mptranslan’ and ‘mptranscty’. The value of the ‘mptranslan’ cookie is a 2- or 3-letter (upper-case) language code in compliance with the ISO 639 standard. The value of the ‘mptranscty’ cookie is a 2-letter (upper-case) country code in compliance with the ISO 3166 standard.
  • Web site server software can determine if a page is being viewed in an alternate language and/or a different country by checking for these cookies. For example, by checking that the ‘mptranslan’ cookie exists, and that its value is ‘ES’, a web server can determine that a page is being served in Spanish and customize the content being served, such as showcasing items that appeal more to Hispanics. In addition, if a company maintains operations in multiple countries, then it can use the ‘mptranscty’ cookie to determine the country and show only products sold or shipped to that country.
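  • A check of these cookies on the original web server might look like the following sketch. Only the cookie names ‘mptranslan’ and ‘mptranscty’ come from this description; the function names and the content labels are hypothetical.

```python
from http.cookies import SimpleCookie


def locale_from_cookie_header(header):
    """Return (language, country) from a Cookie header, or None if absent."""
    cookie = SimpleCookie()
    cookie.load(header)
    language = cookie["mptranslan"].value if "mptranslan" in cookie else None
    country = cookie["mptranscty"].value if "mptranscty" in cookie else None
    return language, country


def choose_showcase(header):
    """Pick a (hypothetical) content variant based on the language cookie."""
    language, _ = locale_from_cookie_header(header)
    return "spanish_showcase" if language == "ES" else "default_showcase"
```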
  • Internal Search Engine Integration
  • When an online user 416 that is viewing a web site 414 in an alternate language performs an internal site search, it is natural for the user to enter the search keyword(s) in the alternate language. When the translation server 400 forwards the search keyword(s) to the original web site, the search engine will not be able to find any matching results, or might deliver incorrect results. This occurs because the web server search engine is matching the keyword(s) in the alternate language against a search index of keywords that are in the original language.
  • The translation server 400 provides an elegant solution to this problem by performing a real-time reverse machine translation on the search keyword(s) and forwarding the keyword(s) to the web server search engine in the original language. Reverse machine translation is configured so it is performed only on the specific keyword field(s) of the search form(s) in a web site.
  • Internet Search Engine Compatibility
  • The system of the present invention is compatible with all Internet search engines, such as Google or AltaVista. These search engines utilize content from both the body and head of the HTML document to index a web page. To ensure transparent compatibility with Internet search engines, the system of the present invention translates all applicable text in the head of the document. This includes the page title, the page description meta-tag, and the keywords meta-tag.
  • Integration with Machine Translation
  • The translation server 400 can use real-time machine translation in the event that a human translation is not yet available for a text segment. This is an optional setting that can be specified per-customer, per-URL pattern and/or by means of directive tags.
  • Efficient Caching
  • Caching frequently used data in memory is necessary to minimize round trips to the database 406. There are two types of caches being used: dynamic and static. A dynamic cache is one whose entries are removed from the cache when memory becomes scarce, and that uses a Most-Recently-Used (MRU) algorithm to keep the most relevant entries in the cache. Managing the cache with an MRU algorithm keeps the most frequently accessed and most recently used entries in the cache. This type of cache is used for large, long-lived caches.
  • In a static cache, entries cannot be removed automatically when memory becomes scarce. This type of cache is normally used for small, short-lived caches, but is also used for long-lived caches that will not grow too large and whose entries must remain in the cache. The translation server 400 contains five memory caches, which are described in more detail below.
  • A main segment cache is a dynamic long-lived cache that stores ACTIVE translated text segments keyed by the composite key derived from the original (not yet translated) text segment's 64-bit hash codes. This allows a quick lookup of translation text. Segments are removed from this cache if they are deactivated in the WebCATT 408. A translation queue segment cache is a dynamic long-lived cache that stores the text segments of all pages that are in the translation queue. This allows the translation server 400 to determine that a specific text segment that has not yet been translated is already in the queue for translation without having to search the database. Segments are removed from this cache when they are activated in the WebCATT 408.
  • A main file cache is a dynamic long-lived cache that stores ACTIVE files keyed by their names. This allows the quick lookup of a translated file. Files are removed from this cache if they are deactivated in the WebCATT 408. A translation queue file cache is a dynamic long-lived cache that stores the files of all pages that are in the translation queue. This allows the translation server 400 to determine that a file that has not yet been translated is already in the queue for translation without having to search the database. Files are removed from this cache when they are activated in the WebCATT 408.
  • A translation queue page cache is a static long-lived cache that stores all pages that are in the translation queue. This allows the translation server 400 to determine that a page that has not yet been translated is already in the queue for translation without having to search the database. A 64-bit hash code is used to determine if a page in the queue has changed and has to be re-scheduled for translation. Pages are removed from this cache when they are activated in the WebCATT 408.
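  • The behaviour of a dynamic cache can be sketched as follows. This is an assumption-laden illustration: a fixed `capacity` stands in for "memory becomes scarce", the most recently used entries are kept by evicting the least recently used one, and the class name `DynamicCache` is hypothetical. The `remove` method corresponds to entries being removed when segments or files are deactivated or activated in WebCATT 408.

```python
from collections import OrderedDict


class DynamicCache:
    """Keeps the most recently used entries, up to a fixed capacity."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._entries = OrderedDict()  # oldest use first, newest use last

    def get(self, key):
        if key not in self._entries:
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return self._entries[key]

    def put(self, key, value):
        self._entries[key] = value
        self._entries.move_to_end(key)
        if len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # evict least recently used

    def remove(self, key):
        """Explicit removal, e.g. when an entry changes status."""
        self._entries.pop(key, None)
```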
  • The translation server 400 is advantageous as it does not require IT integration with an existing web site infrastructure. The present invention converts the outbound HTML stream after it has left the client web server 414. Thus, there is no need to re-architect an existing web site or build a separate web site for an alternate language. Further, there is no client storage or management of translated data required. Translated data is managed and maintained by the WebCATT 408 software outside of the web site's database.
  • The translation server 400 is further advantageous as it works with any client web server hardware and software technology infrastructure. Further, it allows for evolution of the existing client's hardware and software technology infrastructure. Moreover, deployment of the present invention requires minimal effort as a reduced amount of client IT resources are required. The one-link deployment feature involves the client placing one link on the web site 414 to provide access to the alternate language web site. Therefore, deployment is rapid and cost effective.
  • WebCATT
  • The WebCATT (Web Computer Aided Translation Tool) 408 is a web based Graphical User Interface (GUI) application that is used to perform and manage human translations. The tool is built specifically for web (HTML) page translations. It can be used by professional translators to translate web site translatable components and by managers to manage the translation process. Since WebCATT 408 is a web-based application that is accessed via the Internet 412, translators and managers can be located in different geographical areas.
  • WebCATT 408 is similar to other computer aided translation tools used by professional translation service organizations. WebCATT 408 supports localization, text recognition, fuzzy matching, translation memory, internal repetitions, alignment, and a glossary/terminology database. WebCATT 408 is designed for web site translation and includes other features optimized for web translation, such as What You See Is What You Get (WYSIWYG) HTML previewing and support for image/graphic translation.
  • WebCATT 408 organizes the translation workload into web pages. A web page is the HTML content generated by a specific URL address, regardless of whether that content is static (i.e., physically resides in the web server in a file with an .html extension), or dynamic (i.e., the content is generated dynamically by combining information from a database and HTML templates). Dynamic pages that are dependent on session information (e.g., a shopping cart checkout page) are also supported.
  • Within a web page there are two types of units of translation that translators work with: text segments and files. A text segment is a chunk of text on the page as defined by the HTML that surrounds it. A text segment can range from a single word to a paragraph or multiple paragraphs. A file is any type of external content that resides in a file, is linked from within the page, and may require translation. Typical types of files found in web pages are images, PDF files, MS Word documents and Flash movies. A file is translated by uploading a replacement file that has all text and/or sounds translated.
  • FIG. 9 is a screenshot of a WebCATT interface used for viewing a translatable component, in one embodiment of the present invention. FIG. 9 shows a display area 902 in which a web page including a translatable component in a first language (in this case, English) is displayed. Also shown in FIG. 9 is a section 904 including information associated with the web page displayed in display area 902, such as page status, page URL, page ID, etc. Further shown in FIG. 9 is a section 906 including statistics associated with the web site from which the displayed web page is garnered, such as the number of files translated, the number of segments translated, the number of translations suppressed, etc.
  • FIG. 10 is a screenshot of a WebCATT interface used for viewing a translatable component along with a corresponding translation, in one embodiment of the present invention. FIG. 10 shows a display area 1002 in which an original image file translatable component is displayed in a first language (in this case, English). FIG. 10 shows a display area 1004 in which a translated image file is displayed in a second language (in this case, Spanish). Also shown in FIG. 10 is a section 1006 including information associated with the file displayed in display areas 1002-1004, such as file status, file URL, file ID, etc. FIG. 10 shows how WebCATT 408 allows a user to view a translatable component alongside a corresponding translated component for comparison.
  • FIG. 11 is a screenshot of a WebCATT interface used for editing a translatable component, in one embodiment of the present invention. FIG. 11 shows a display area 1102 in which a web page including a translated component in a second language (in this case, Spanish) is displayed. The display area 1102 provides a WYSIWYG web page preview feature that allows viewing the translated web page as it is being translated. Translations can often result in a significant amount of word growth (e.g., approx. 20% from English to Spanish) or shrinkage, which can result in carefully formatted web page layouts being knocked out of alignment by the longer text. The WYSIWYG page preview feature allows translators to immediately see the translated web pages and quickly make adjustments in word choice in order to maintain the correct alignment and layout of the page when translated.
  • Also shown in FIG. 11 is a section 1104 including information associated with the web page displayed in display area 1102, such as page status, page URL, page ID, etc. Further shown in FIG. 11 is a section 1106 including statistics associated with the web site from which the displayed web page is garnered, such as the number of files translated, the number of segments translated, the number of translations suppressed, etc. In addition to each of those statistics, a breakdown of translated and not translated components is shown in both units and percentages.
  • A section 1110 provides a text segment edit form that allows a translator to edit text segments in the order they appear on the page. This form features a fuzzy search feature that automatically shows and sorts existing segment matches in the database. The translator can copy an existing translation from the search results area to use as a starting translation.
  • A section 1108 provides a file list form that allows a translator to preview all linked files on the page. The list form allows the translator to select all files that do not require translation (e.g., an image with no text) and quickly tag them as such. It also allows a translator to select individual files for translation via the file edit form. File translation involves uploading a translated file and translating the file text description if present.
  • The GUI of FIG. 11 allows a user to view the plurality of translated components placed into the format derived from the first, or source, content, thereby enabling a user to review how the translated components are rendered in the first content format. The GUI of FIG. 11 further allows a user to highlight any of the plurality of translatable components, which are not yet translated, differently from translated components when previewing the plurality of translated components in the first content format. The GUI of FIG. 11 further allows a user to display text when hovering over a translated component so as to view the first content corresponding to the translated component.
  • The GUI of FIG. 11 further allows a user to select at least one of the translated components when previewing the plurality of translated components in the first content format so as to edit the translated component and store the translated component that has been revised with the corresponding unique identifier. Lastly, the GUI of FIG. 11 further allows previewing in a multi-user environment so that more than one user can simultaneously view translated components rendered in the first content format.
  • WebCATT 408 also provides complete management of the translation process. Web pages are scheduled for translation either automatically by the translation server 400, or manually by a manager via upload of web pages or other type of content to be translated. When a web page is scheduled for translation it is placed in the translation queue of a specific customer. Pages to be translated are scheduled for translation on a priority basis using algorithms based on the percentage of the page already translated and how often the page is being accessed on the original web server while it's in the translation queue. This allows the most important pages (i.e., most frequently accessed and those with smaller changes) to be translated first.
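  • The priority ordering described above could be sketched as follows. The specification does not give the actual algorithm, so the scoring formula, its weights, and all names here (`QueuedPage`, `priority`, `schedule`) are assumptions chosen only to show how the two stated inputs, percentage already translated and access frequency while queued, might be combined.

```python
from dataclasses import dataclass


@dataclass
class QueuedPage:
    url: str
    percent_translated: float  # 0 to 100; higher means fewer changes remain
    hits_while_queued: int     # accesses on the original server while queued


def priority(page, hit_weight=1.0, done_weight=1.0):
    """Assumed score: frequently accessed, nearly complete pages rank first."""
    return hit_weight * page.hits_while_queued + done_weight * page.percent_translated


def schedule(pages):
    """Return pages ordered from highest to lowest priority."""
    return sorted(pages, key=priority, reverse=True)
```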
  • Once pages are in the queue, a manager can assign them for translation to a specific translator or translation service subcontractor. If assigned to a subcontractor, a subcontractor manager can then assign them to specific translators within the subcontractor organization or even to freelancers that work with them. Proofers can also be assigned. A subcontractor can assign its own proofers to pages and managers can also assign proofers to check the work of translators or subcontractors.
  • A web page must go through a series of status changes before it is available via the Internet. A page can have any of the following statuses: NEW, IN-PRODUCTION, and ACTIVE. When a page is placed in the queue its status is NEW. When a translator first accesses the page for the purpose of translating it, its status is changed to IN-PRODUCTION. After the page is fully translated and proofed, a manager changes its status to ACTIVE. Only ACTIVE pages are available via the Internet.
  • In addition to the page statuses, the text and files within the page maintain their own translation status. The status for text segments and files is maintained both at the page level (i.e., one single overall status for all segments in the page and another one for all files in the page) and individually. A text segment or file can have any of the following statuses: NEW, TRANSLATED, CONTRACTOR_PROOFED, PROOFED and ACTIVE. The initial status is NEW. After a translator translates the text or file the status is changed to TRANSLATED. When the translation is proofed by a subcontractor proofer the status is changed to CONTRACTOR_PROOFED and when it is proofed by an internal proofer the status is changed to PROOFED. Finally the manager changes the status to ACTIVE. A page can only be activated after all segments and files within it are ACTIVE.
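  • The segment and file status lifecycle can be sketched as a small state machine. The five status names come from the list above. The exact set of permitted transitions is not spelled out in the description, so the `ALLOWED` table below is an assumption, and the function names are hypothetical.

```python
from enum import Enum

class SegmentStatus(Enum):
    NEW = "NEW"
    TRANSLATED = "TRANSLATED"
    CONTRACTOR_PROOFED = "CONTRACTOR_PROOFED"
    PROOFED = "PROOFED"
    ACTIVE = "ACTIVE"

S = SegmentStatus

# Assumed transition table: subcontractor proofing is optional,
# internal proofing is required before activation.
ALLOWED = {
    S.NEW: {S.TRANSLATED},
    S.TRANSLATED: {S.CONTRACTOR_PROOFED, S.PROOFED},
    S.CONTRACTOR_PROOFED: {S.PROOFED},
    S.PROOFED: {S.ACTIVE},
    S.ACTIVE: set(),
}

def advance(current, target):
    """Return the new status, or raise ValueError for a disallowed change."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot go from {current.name} to {target.name}")
    return target

def page_can_be_activated(statuses):
    """A page may be activated only when every segment and file is ACTIVE."""
    return all(status is S.ACTIVE for status in statuses)
```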
  • FIG. 12 is a screenshot of a WebCATT interface used for viewing a translation queue, in one embodiment of the present invention. FIG. 12 shows a series of columns wherein a unit of information is provided for each page of the web site 414 listed on each row. FIG. 12 shows a first column 1202 including unique page identifiers. Column 1204 includes a URL for each page. Column 1206 includes receipt data for each page. Column 1208 includes a percentage statistic indicating the percentage of the page that has been translated. Column 1210 indicates a status for each page. Column 1212 indicates the contractor assigned to the page.
  • FIG. 13 is an operational flow diagram depicting the process of WebCATT 408, according to a preferred embodiment of the present invention. The operational flow diagram of FIG. 13 depicts the process by which WebCATT 408, which provides a web based tool for managing language translations of content, queues and translates components of a web site 414. The operational flow diagram of FIG. 13 begins with step 1302 and flows directly to step 1304.
  • In step 1304, WebCATT 408 retrieves a first content, or HTML source page, in a first language from the web site 414. In step 1306, WebCATT 408 parses the first content into a plurality of translatable components. In step 1308, WebCATT 408 generates a unique identifier for each of the plurality of translatable components of the first content. In step 1310, WebCATT 408 queues the plurality of translatable components and corresponding unique identifiers for human or machine translation into a second language.
  • In step 1312, for each of the plurality of translatable components, WebCATT 408 stores a translated component and an associated unique identifier corresponding to the translatable component, thereby storing a plurality of translated components and corresponding unique identifiers.
  • In step 1314, WebCATT 408 provides the plurality of translatable components and corresponding unique identifiers to a third party for human translation into a second language. In step 1316, the control flow of FIG. 13 stops.
  • WebCATT 408 is advantageous as it allows translators to work directly with live pages off the web site 414 being translated. Thus, the client web site 414 need not send information to the translation server 400 for translation. Furthermore, all web pages in a web site are automatically entered into the translation work queue by the WebCATT 408 spider 404, described in greater detail below.
  • WebCATT 408 is further advantageous as WYSIWYG preview allows translators to see translated web pages, as they would appear on the live web site. This allows the translator to compensate for word growth or shrinkage that knocks a web page layout out of alignment.
  • Furthermore, a translated preview page is marked-up with special HTML & JavaScript to allow: 1) color coding of all text in the web page so the translator can see what is already translated, what remains to be translated and where the current text segment is located within the page, 2) clicking in text or a file to take the translator to a form to edit the translation for the text or file and 3) hovering the mouse over a text or file to pop up a window showing the original wording or file.
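  • The preview mark-up can be illustrated with a minimal sketch that wraps each segment in a color-coded element. The description mentions special HTML and JavaScript, while this sketch uses only a `span` with an inline background color and a `title` attribute for the hover text. The class names, colors and the function `mark_up_segment` are hypothetical assumptions, not the actual WebCATT mark-up.

```python
from html import escape

def mark_up_segment(original, translation):
    """Wrap one text segment for the preview page.

    translation is None when the segment has not been translated yet.
    """
    if translation is None:
        # Untranslated text is shown in the original wording, highlighted.
        return ('<span class="untranslated" style="background:#fdd">%s</span>'
                % escape(original))
    # Translated text carries the original wording as hover text.
    return ('<span class="translated" style="background:#dfd" title="%s">%s</span>'
            % (escape(original, quote=True), escape(translation)))
```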
  • WebCATT 408 is further advantageous as pages are parsed into their translatable components and translators only work with these components, not a complex group of HTML files. All HTML and script code is hidden when using WebCATT 408. WebCATT 408 is further beneficial as it can be utilized via the ASP model and translators can access it via the web. Translated pages can be delivered via the translation server 400 or saved as static HTML pages to be sent to the client, wherein links among pages are modified so they reference the translated pages.
  • WebCATT 408 is further beneficial as it allows management of the translation process. Multiple user access levels are supported: managers, proofers, translators & sub-contractors. Managers can assign work in the translation queue to translators, proofers and/or subcontractors. Subcontractor managers can in turn sub-assign work to subcontractor translators and proofers. Managers must activate web pages before the translation server 400 can deliver them.
  • TransScope
  • A spider is a program that visits web sites and reads their pages and other information in order to create entries for an index such as a search engine index. For example, the major search engines on the Internet all have such a program, which is also known as a “crawler” or a “bot.” Spiders are typically programmed to visit web sites that have been submitted by their owners as new or updated. Entire web sites or specific pages can be selectively visited and indexed. Spiders are called spiders because they usually visit many web sites in parallel at the same time, their “legs” spanning a large area of the “web.” Spiders can crawl through a web site's pages in several ways.
  • One way a spider can crawl through a web site is to follow all the hypertext links in each page until all the pages have been read. The spiders for the major search engines on the Internet adhere to the rules of politeness for Web spiders that are specified in a standard for robot exclusion. This standard asks each server which files should be excluded from being indexed. A spider does not (or cannot) go through a firewall. The standard also prescribes a special algorithm for waiting between successive server requests so that the spider doesn't affect web site response time for other users.
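  • As an illustration of the robot exclusion rules, the sketch below uses Python's standard `urllib.robotparser` module to check whether a URL may be fetched and how long to wait between requests. The robots.txt content, the user-agent name `ExampleSpider` and the host `www.example.com` are made up for the example, and `Crawl-delay` is a widely used extension rather than something this description specifies.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, parsed locally without network access.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 2
"""

def build_parser(robots_txt):
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser

parser = build_parser(ROBOTS_TXT)

# True when the rules allow this agent to fetch the URL.
allowed_public = parser.can_fetch(
    "ExampleSpider", "http://www.example.com/public/index.html")
allowed_private = parser.can_fetch(
    "ExampleSpider", "http://www.example.com/private/account.html")

# Seconds to wait between requests, or None if no delay is given.
delay = parser.crawl_delay("ExampleSpider")
```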
  • The operations of a spider are in contrast with a normal web browser operated by a human that doesn't automatically follow links other than inline images and URL redirection. The algorithm used by spiders to pick which references to follow strongly depends on the spider's purpose. Index-building spiders usually retrieve a significant proportion of the references. The other extreme is spiders that try to validate the references in a set of documents. These spiders usually do not retrieve any of the links apart from redirections.
  • FIG. 4 shows a spider 404 for use in analyzing and sizing a web site 414. The spider 404 is a tool that crawls specific web sites and performs any of a variety of actions. The spider 404 can crawl a web site in order to populate the WebCATT translation queue with new or updated information. The spider 404 may also gather content statistics that can be used to provide a monetary quote for deployment of the present invention.
  • FIG. 14 is an operational flow diagram depicting the process of spider 404, according to a preferred embodiment of the present invention. The operational flow diagram of FIG. 14 depicts the process by which spider 404, which provides a web based tool for sizing a web site for language translation, retrieves and indexes translatable components of a web site 414. The operational flow diagram of FIG. 14 begins with step 1402 and flows directly to step 1404.
  • In step 1404, spider 404 retrieves a first content, or HTML source page, in a first language from the web site 414. The first content in a first language is for translation into a second content in a second language. The second web content is a human or machine translation in a second language of the first web content. In step 1406, spider 404 parses the first content into a plurality of translatable components. A translatable component includes any one of a text segment, an image file with text to be translated, a multimedia file with text or audio to be translated, a file with text to be translated, a file with an image to be translated, a file with audio to be translated and a file with video with at least one of text and audio to be translated.
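  • The parsing of step 1406 can be sketched for the text-segment case with Python's standard `html.parser` module. Treating each run of text between tags as one segment, and skipping `script` and `style` content, are simplifying assumptions. The description does not define segment boundaries at this level of detail, and the names `SegmentExtractor` and `extract_segments` are hypothetical.

```python
from html.parser import HTMLParser

class SegmentExtractor(HTMLParser):
    """Collects visible text runs as candidate translatable segments."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.segments = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Collapse whitespace so layout changes do not alter the segment.
        text = " ".join(data.split())
        if text and not self._skip_depth:
            self.segments.append(text)

def extract_segments(html):
    extractor = SegmentExtractor()
    extractor.feed(html)
    extractor.close()
    return extractor.segments
```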
  • In step 1408, spider 404 generates a unique identifier for each of the plurality of translatable components of the first content. For a text segment, the translation server 400 can generate a unique identifier using a hash code, a checksum or a mathematical algorithm. In step 1410, spider 404 stores the plurality of translatable components and corresponding unique identifiers in the database 406 for human or machine translation into the second language.
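  • The identifier generation of step 1408 can be sketched as a hash over the segment text. The description allows a hash code, a checksum or a mathematical algorithm without naming one, so the choice of SHA-256, the whitespace normalization and the function name `segment_id` are assumptions made for illustration.

```python
import hashlib

def segment_id(text):
    """Return a stable identifier for a text segment.

    Whitespace is collapsed first (an assumption) so that the same wording
    yields the same identifier regardless of line breaks in the HTML.
    """
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()
```

  • Because the identifier depends only on the text, the same segment appearing on several pages maps to a single stored translation.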
  • In optional step 1412, spider 404 queues the plurality of translatable components and corresponding unique identifiers for human or machine translation into a second language. In optional step 1414, spider 404 provides the plurality of translatable components and corresponding unique identifiers to WebCATT 408 for human translation into a second language. In step 1416, spider 404 generates statistics based on the translatable components retrieved from the web site 414. The statistics generated include a file count, a page count, a translatable segment count, a unique text segment count, a unique text segment word count and a word count. The spider 404 can further generate a web page having a link to each file of the web site 414. In step 1418, the control flow of FIG. 14 stops.
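  • The statistics of step 1416 can be sketched for text segments as follows. The input shape (a mapping from page URL to the list of segments found on that page) and the function name `site_statistics` are assumptions. File counts are omitted because this sketch handles only text.

```python
def site_statistics(pages):
    """pages: dict mapping a page URL to its list of text segments."""
    all_segments = [seg for segs in pages.values() for seg in segs]
    unique_segments = set(all_segments)
    return {
        "page_count": len(pages),
        "segment_count": len(all_segments),
        "unique_segment_count": len(unique_segments),
        # Words in every segment, counting repeated segments each time.
        "word_count": sum(len(seg.split()) for seg in all_segments),
        # Words in unique segments only, closer to the real translation effort.
        "unique_segment_word_count": sum(len(seg.split()) for seg in unique_segments),
    }
```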
  • The spider 404 can be pre-configured for each customer web site so that the use of directive tags and/or attributes is eliminated or minimized. This minimizes the workload of the customer web site's IT personnel. Further, the spider 404 can be separately pre-defined by domain and/or by URL pattern. This allows specifying sections of a web site to be translated without the need for placing directive tags in each web page.
  • The spider 404 is advantageous as it can be used to update the WebCATT 408 translation work queue. Further, spider 404 can be used to gather statistics about a web site 414 in order to allow estimating the amount of work involved in translating the web site and pricing accordingly.
  • Spider 404 can summarize word counts, segment counts, file counts and page counts of a web site 414. The spider 404 is further efficient and supplements the functions of WebCATT 408 as it works to save all unique text segments and file URLs in the database 406 for later translation into a second language. It can further create an HTML page containing links to all files of web site 414, so the files can be reviewed for translation at a later time.
  • The spider 404 is efficient in navigating and crawling a web site 414 as it can emulate a browser by saving and returning cookies. Spider 404 can further fill out and submit forms with pre-defined information and is able to establish a session and normalize session ID parameters for e-commerce sites. Spider 404 can further be configured to crawl only specific areas of a web site by defining include/exclude domains and URL patterns. Spider 404 can also be configured to send specific HTTP headers, such as the user-agent (i.e., type of browser). Spider 404 can be executed in a single computer or in distributed mode. In distributed mode, multiple machines work in conjunction to crawl the same web site simultaneously sharing the same database 406.
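  • Two of the crawl controls above, include/exclude URL patterns and session ID normalization, can be sketched with the Python standard library. The glob-style pattern syntax, the list of session parameter names in `SESSION_PARAMS`, and the function names are hypothetical assumptions. The description does not state how patterns or session parameters are configured.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumed names of query parameters that carry a session ID.
SESSION_PARAMS = {"sid", "sessionid", "jsessionid"}

def normalize_url(url):
    """Drop session ID query parameters so one page has one queue entry."""
    parts = urlsplit(url)
    query = [(key, value)
             for key, value in parse_qsl(parts.query, keep_blank_values=True)
             if key.lower() not in SESSION_PARAMS]
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       urlencode(query), ""))

def should_crawl(url, include, exclude):
    """Match host plus path against glob patterns; exclude rules win."""
    parts = urlsplit(url)
    target = parts.netloc + parts.path
    if any(fnmatchcase(target, pattern) for pattern in exclude):
        return False
    return any(fnmatchcase(target, pattern) for pattern in include)
```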
  • TransSync
  • Most web sites are continuously updated with new information, but maintaining an alternate language web site up to date presents a challenge when using traditional methods. The system of the present invention provides an elegant solution to this problem by providing various methods to maintain an alternate language web site up to date.
  • Automatic maintenance involves automated maintenance of the alternate language web site so as to be maintained in synchronization with the original site with no human intervention or little additional effort. Automatic maintenance is based on a function of the translation server 400: the translation server 400 automatically schedules a web page for translation by placing it in the WebCATT 408 translation queue (described in more detail above) in the event a translation cannot be found for one or more text segments or linked files in the page. Thus, the act of viewing a never-before translated or a modified page in the alternate language enables the scheduling of the web page for translation.
  • There are several ways to leverage the auto-scheduling function of the translation server 400. One way involves manual quality assurance review. If a new web page or an updated web page goes through a manual quality assurance process that involves a person reviewing the page before it is released to the live web site, then the quality assurance personnel simply attempt to view the page in the alternate language during the review process. This will place the new web page in the WebCATT 408 translation queue for translation before the page goes into the production (live) web site. General Information and Policy type web pages are good candidates for this process.
  • Another way to leverage the auto-scheduling function of the translation server 400 involves the spider agent 404. In the case of web pages that do not undergo an individual quality assurance review before going into production, the spider agent 404 can be used to crawl a web site, or just portions of a web site, in the alternate language on a regular basis. Crawling the web site in the alternate language is equivalent to a user viewing the site in the alternate language, and thus results in any new or modified pages being placed in the WebCATT 408 translation queue.
  • This technique is ideal for regularly scheduled updates to a web site, which normally happens after hours. For example, if the ABC Widgets web site modifies its sale offerings twice a week, such as on Mondays and Fridays at 12 AM, then the spider agent 404 can be scheduled to crawl the relevant parts of the site shortly after (e.g., at 12:30 AM) on those days. Around-the-clock translators can then translate the new sale banners so that the alternate language web site is up to date sometime later that morning.
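  • The ABC Widgets schedule above (a crawl at 12:30 AM on Mondays and Fridays) can be sketched as a function that computes the next crawl time. The function name `next_crawl` and the constants are hypothetical, and how the spider agent 404 is actually scheduled is not stated in the description.

```python
from datetime import datetime, time, timedelta

CRAWL_WEEKDAYS = {0, 4}     # Monday and Friday (datetime.weekday() values)
CRAWL_TIME = time(0, 30)    # 12:30 AM, shortly after the 12 AM site update

def next_crawl(now):
    """Return the first scheduled crawl time strictly after `now`."""
    # Eight offsets cover today through the same weekday next week.
    for offset in range(8):
        day = (now + timedelta(days=offset)).date()
        candidate = datetime.combine(day, CRAWL_TIME)
        if day.weekday() in CRAWL_WEEKDAYS and candidate > now:
            return candidate
    raise RuntimeError("no crawl day configured")
```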
  • The spider agent 404 can also be used to regularly (e.g., daily) crawl a web site even when changes are not regularly scheduled. This will guarantee that the alternate language site is in sync with the original language site after every crawl and subsequent translation.
  • Another way to leverage the auto-scheduling function of the translation server 400 involves user access. Even if no manual quality assurance reviews or scheduled spider agent 404 crawls are performed, the alternate language web site is still automatically maintained up to date over the long term. This is because the first online user that attempts to view a new or modified page in the alternate language will trigger the placement of that page into the WebCATT translation queue. In that case, the online user will see the page in the original language or will see a partially translated page, depending on the amount of new content in the page and the pre-defined customer-specified translation threshold. However, subsequent users that access the page will see the web page in the alternate language after it has been translated.
  • In addition to automatic maintenance, the present invention also supports manual maintenance of the alternate language web site so as to be maintained in synchronization with the original site. New information that needs translation can also be manually placed in the translation queue using WebCATT 408. This can be useful to translate large amounts of data that is available in advance of it being on the live web site 414. For example, if the ABC Widgets web site updates its web site with new product offerings every Thursday morning and all product information is available by the previous Tuesday, then all new product data can be manually batched into the translation queue using WebCATT 408 as soon as it is available so it is fully translated by the time the new web pages go live.
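  • The threshold decision described above can be sketched as follows. The description says only that the outcome depends on the amount of new content and a customer-specified threshold, so expressing the threshold as a minimum percentage of translated segments, and the name `choose_view`, are assumptions.

```python
def choose_view(total_segments, translated_segments, threshold_percent):
    """Decide what the first visitor sees for a new or modified page.

    Returns "translated", "partial" or "original".
    """
    if total_segments == 0:
        return "translated"  # nothing on the page needs translation
    percent = 100.0 * translated_segments / total_segments
    if percent >= 100.0:
        return "translated"
    # Assumed rule: show a partially translated page only when enough of it
    # is already translated; otherwise fall back to the original language.
    return "partial" if percent >= threshold_percent else "original"
```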
  • Population of the WebCATT 408 translation queue can be performed either by URL or by content. Population by URL means that translation server 400 stores only the URL of the page in the queue. The content of the URL is retrieved afterwards when a translator accesses the page to translate it using WebCATT 408. Population by URL can present a problem if the content of the page is dependent on session information, such as a session ID present in a query parameter or stored in a cookie. In that case, the session ID in the query parameter may have expired or the session information stored in the cookie will not be present when viewing the page in WebCATT 408. This is usually the case in shopping cart or account access pages.
  • Session dependent pages can be handled in two ways: (1) by replicating the session state via cookies and/or updated session parameters, or (2) by populating the page by content. Replicating the session state means that the translator must manually re-acquire a session from the original site and then enter the session data in WebCATT 408. Once the session data is entered it can be used for translating multiple pages. Population by content means that translation server 400 stores the full content of the page in the queue. This avoids the session dependence issue, but can result in outdated content. As a result, population by content is only used for session dependent pages, and population by URL, which guarantees that the content being translated is the latest content, is used for all other pages.
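  • The two population modes can be sketched as a queue entry that either stores only the URL or stores the full page content. The `QueueEntry` type, its fields and the helper functions are hypothetical. The `fetch` argument stands in for whatever retrieves a page at translation time.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class QueueEntry:
    url: str
    # None means "populated by URL": content is fetched when translating.
    content: Optional[str] = None

def make_entry(url, fetched_html, session_dependent):
    """Store full content only for session dependent pages."""
    return QueueEntry(url, fetched_html if session_dependent else None)

def content_for_translator(entry, fetch: Callable[[str], str]):
    """Return stored content if present, else the latest content by URL."""
    if entry.content is not None:
        return entry.content
    return fetch(entry.url)
```

  • Population by URL always yields the latest content, while population by content trades freshness for independence from an expired session.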
  • Access to the WebCATT 408 translation queue is segmented by customer and prioritized. Pages to be translated are scheduled for translation on a priority basis using algorithms based on the percentage of the page already translated and how often the page is being accessed on the original web server while the page is in the translation queue. This allows the most important pages (i.e., most frequently accessed and those with smaller changes) to be translated first.
  • A file change detection feature can be used to deal with files whose content has changed. The translation server 400 and WebCATT 408 can match a file to be translated with its translated file by the URL of the original file. However, it is possible for a file to be changed while its name and location remain the same. In that case, it is possible that an outdated translated file is used for the translation.
  • To overcome this issue, the translation server 400 computes a hash-code or checksum based on the binary content of the file and stores it with the URL. At conversion time, each time a file is presented for translation the translation server 400 re-computes the hash-code or checksum and compares it against the stored one. If they match, the file has not changed and the existing translated file can be used as replacement. However, if they do not match, the binary content of the file was changed and the existing file translation cannot be used. In that case, the page that contains the file is placed in the WebCATT 408 translation queue so the file may be re-translated.
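  • The checksum comparison above can be sketched as follows. SHA-256 is an assumed choice, since the description says only "hash-code or checksum", and the in-memory dictionary `stored` stands in for the database. The function names are hypothetical.

```python
import hashlib

def file_digest(data: bytes) -> str:
    """Hash of the binary content of a file."""
    return hashlib.sha256(data).hexdigest()

def record_file(url, data, stored):
    """Store the digest alongside the URL when a translation is saved."""
    stored[url] = file_digest(data)

def translation_is_current(url, data, stored):
    """True when the file at `url` is unchanged since it was recorded.

    False means the existing translated file cannot be reused and the
    containing page should go back into the translation queue.
    """
    return stored.get(url) == file_digest(data)
```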
  • FIG. 15 is an operational flow diagram depicting the synchronization process according to a preferred embodiment of the present invention. The operational flow diagram of FIG. 15 depicts the automated maintenance process of the alternate language web site so as to be maintained in synchronization with the original web site 414. The operational flow diagram of FIG. 15 begins with step 1502 and flows directly to step 1504.
  • In step 1504, a first content in a first language, or HTML source page, is retrieved from the web site 414. The first content in a first language is for translation into a second content in a second language. The second web content is a human or machine translation in a second language of the first web content. In step 1506, the first content is parsed into a plurality of translatable components.
  • In step 1508, a unique identifier is generated for each of the plurality of translatable components of the first content. For a text segment, a unique identifier is generated using a hash code, a checksum or a mathematical algorithm.
  • In step 1510, a plurality of translated components of the second web content are identified or matched using the unique identifier of each of the plurality of translatable components of the first web content. If a translatable component of the first web content is not matched to a translated component of the second web content, in step 1512, the translatable component is designated for translation into the second language. In optional step 1514, the plurality of translatable components that weren't matched are queued for human or machine translation into a second language. In optional step 1516, the plurality of translatable components that weren't matched are provided to WebCATT 408 for translation into a second language. In step 1518, the control flow of FIG. 15 stops.
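  • Steps 1508 through 1512 can be sketched as a lookup of each component's identifier in a store of translated components. The dictionary `translations` stands in for the database 406, SHA-256 is an assumed hash, and the function names are hypothetical.

```python
import hashlib

def segment_id(text):
    # Assumed identifier: SHA-256 over whitespace-normalized text.
    return hashlib.sha256(" ".join(text.split()).encode("utf-8")).hexdigest()

def match_components(segments, translations):
    """Split segments into matched translations and those needing work.

    translations: dict mapping an identifier to its translated text.
    Returns (matched, unmatched), where matched maps each source segment to
    its translation and unmatched lists segments designated for translation.
    """
    matched, unmatched = {}, []
    for segment in segments:
        identifier = segment_id(segment)
        if identifier in translations:
            matched[segment] = translations[identifier]
        else:
            unmatched.append(segment)
    return matched, unmatched
```

  • The unmatched list is what would be queued for human or machine translation in optional steps 1514 and 1516.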
  • Exemplary Implementations
  • The present invention can be realized in hardware, software, or a combination of hardware and software. A system according to a preferred embodiment of the present invention can be realized in a centralized fashion in one computer system, or in a distributed fashion where different elements are spread across several interconnected computer systems. Any kind of computer system—or other apparatus adapted for carrying out the methods described herein—is suited. A typical combination of hardware and software could be a general-purpose computer system with a computer program that, when being loaded and executed, controls the computer system such that it carries out the methods described herein.
  • An embodiment of the present invention can also be embedded in a computer program product, which comprises all the features enabling the implementation of the methods described herein, and which, when loaded in a computer system, is able to carry out these methods. Computer program means or computer program as used in the present invention indicates any expression, in any language, code or notation, of a set of instructions intended to cause a system having an information processing capability to perform a particular function either directly or after either or both of the following: a) conversion to another language, code, or notation; and b) reproduction in a different material form.
  • A computer system may include, inter alia, one or more computers and at least a computer readable medium, allowing a computer system to read data, instructions, messages or message packets, and other computer readable information from the computer readable medium. The computer readable medium may include non-volatile memory, such as ROM, Flash memory, Disk drive memory, CD-ROM, and other permanent storage. Additionally, a computer readable medium may include, for example, volatile storage such as RAM, buffers, cache memory, and network circuits. Furthermore, the computer readable medium may comprise computer readable information in a transitory state medium such as a network link and/or a network interface, including a wired network or a wireless network, that allow a computer system to read such computer readable information.
  • FIG. 16 is a block diagram of a computer system useful for implementing an embodiment of the present invention. The computer system includes one or more processors, such as processor 1604. The processor 1604 is connected to a communication infrastructure 1602 (e.g., a communications bus, cross-over bar, or network). Various software embodiments are described in terms of this exemplary computer system. After reading this description, it will become apparent to a person of ordinary skill in the relevant art(s) how to implement the invention using other computer systems and/or computer architectures.
  • The computer system can include a display interface 1608 that forwards graphics, text, and other data from the communication infrastructure 1602 (or from a frame buffer not shown) for display on the display unit 1610. The computer system also includes a main memory 1606, preferably random access memory (RAM), and may also include a secondary memory 1612. The secondary memory 1612 may include, for example, a hard disk drive 1614 and/or a removable storage drive 1616, representing a floppy disk drive, a magnetic tape drive, an optical disk drive, etc. The removable storage drive 1616 reads from and/or writes to a removable storage unit 1618 in a manner well known to those having ordinary skill in the art. Removable storage unit 1618 represents a floppy disk, magnetic tape, optical disk, etc., which is read by and written to by removable storage drive 1616. As will be appreciated, the removable storage unit 1618 includes a computer usable storage medium having stored therein computer software and/or data.
  • In alternative embodiments, the secondary memory 1612 may include other similar means for allowing computer programs or other instructions to be loaded into the computer system. Such means may include, for example, a removable storage unit 1622 and an interface 1620. Examples of such may include a program cartridge and cartridge interface (such as that found in video game devices), a removable memory chip (such as an EPROM, or PROM) and associated socket, and other removable storage units 1622 and interfaces 1620 which allow software and data to be transferred from the removable storage unit 1622 to the computer system.
  • The computer system may also include a communications interface 1624. Communications interface 1624 allows software and data to be transferred between the computer system and external devices. Examples of communications interface 1624 may include a modem, a network interface (such as an Ethernet card), a communications port, a PCMCIA slot and card, etc. Software and data transferred via communications interface 1624 are in the form of signals which may be, for example, electronic, electromagnetic, optical, or other signals capable of being received by communications interface 1624. These signals are provided to communications interface 1624 via a communications path (i.e., channel) 1626. This channel 1626 carries signals and may be implemented using wire or cable, fiber optics, a phone line, a cellular phone link, an RF link, and/or other communications channels.
  • In this document, the terms “computer program medium,” “computer usable medium,” and “computer readable medium” are used to generally refer to media such as main memory 1606 and secondary memory 1612, removable storage drive 1616, a hard disk installed in hard disk drive 1614, and signals. These computer program products are means for providing software to the computer system. The computer readable medium allows the computer system to read data, instructions, messages or message packets, and other computer readable information from the computer readable medium. The computer readable medium, for example, may include non-volatile memory, such as Floppy, ROM, Flash memory, Disk drive memory, CD-ROM, and other permanent storage. It is useful, for example, for transporting information, such as data and computer instructions, between computer systems. Furthermore, the computer readable medium may comprise computer readable information in a transitory state medium such as a network link and/or a network interface, including a wired network or a wireless network, that allow a computer to read such computer readable information.
  • Computer programs (also called computer control logic) are stored in main memory 1606 and/or secondary memory 1612. Computer programs may also be received via communications interface 1624. Such computer programs, when executed, enable the computer system to perform the features of the present invention as discussed herein. In particular, the computer programs, when executed, enable the processor 1604 to perform the features of the computer system. Accordingly, such computer programs represent controllers of the computer system.
  • Although specific embodiments of the invention have been disclosed, those having ordinary skill in the art will understand that changes can be made to the specific embodiments without departing from the spirit and scope of the invention. The scope of the invention is not to be restricted, therefore, to the specific embodiments. Furthermore, it is intended that the appended claims cover any and all such applications, modifications, and embodiments within the scope of the present invention.

Claims (15)

What is claimed is:
1. A method, implemented on a machine having at least one processor, storage, and a communication platform, for providing statistics characterizing translation work in synchronizing content in different languages, comprising:
receiving a request from a user for accessing content hosted on a website in a first language, wherein the user requests to view the content in a second language and at least some of the content in the first language has previously been translated into the second language;
obtaining the content in the first language from the website via a publicly accessible network path based on the request;
parsing the obtained content in the first language into a plurality of translatable components;
accessing a database that stores the content in the second language previously translated as translated components;
identifying at least some of the plurality of translatable components that do not have a corresponding translated component in the database;
generating statistics based on the at least some of the translatable components to estimate the work load involved in translation of the at least some of the translatable components from the first language to the second language; and
providing the statistics to characterize a service related to synchronizing the content in the first and second languages.
2. The method according to claim 1, wherein the translation includes human translating the at least some of the plurality of translatable components.
3. The method according to claim 1, further comprising adding the at least some of the plurality of translatable components to a translation list for translation into the second language.
4. The method according to claim 1, further comprising generating an identifier for each of the plurality of translatable components such that each of the plurality of translatable components is accessible via a corresponding identifier.
5. The method according to claim 4, wherein the identifier for a text segment is generated using at least one of a hash code, a checksum, and a mathematical algorithm based on one or more text segments.
6. The method according to claim 1, wherein the statistics includes at least one of a file count, a page count, a text segment count, a unique text segment count, a word count, and a unique word count.
7. The method of claim 1, wherein the generating comprises:
computing the statistics based on information associated with any of the at least some of the plurality of translatable components that do not have a corresponding translated component in the second language.
8. A machine readable non-transitory medium having information stored thereon for providing statistics characterizing translation work in synchronizing content in different languages, wherein the information, when read, causes the machine to perform the following:
receiving a request from a user for accessing content hosted on a website in a first language, wherein the user requests to view the content in a second language and at least some of the content in the first language has previously been translated into the second language;
obtaining the content in the first language from the website via a publicly accessible network path based on the request;
parsing the obtained content in the first language into a plurality of translatable components;
accessing a database that stores the content in the second language previously translated as translated components;
identifying at least some of the plurality of translatable components that do not have a corresponding translated component in the database;
generating statistics based on the at least some of the translatable components to estimate the work load involved in translation of the at least some of the translatable components from the first language to the second language; and
providing the statistics to characterize a service related to synchronizing the content in the first and second languages.
9. The medium according to claim 8, wherein the translation includes human translating the at least some of the plurality of translatable components.
10. The medium according to claim 8, wherein the information, when read, further causes the machine to perform the following: adding the at least some of the plurality of translatable components to a translation list for translation into the second language.
11. The medium according to claim 8, wherein the information, when read, further causes the machine to perform the following: generating an identifier for each of the plurality of translatable components such that each of the plurality of translatable components is accessible via a corresponding identifier.
12. The medium according to claim 11, wherein the identifier for a text segment is generated using at least one of a hash code, a checksum, and a mathematical algorithm based on one or more text segments.
13. The medium according to claim 8, wherein the statistics includes at least one of a file count, a page count, a text segment count, a unique text segment count, a word count, and a unique word count.
14. The medium according to claim 8, wherein the generating comprises:
computing the statistics based on information associated with any of the at least some of the plurality of translatable components that do not have a corresponding translated component in the second language.
15. A system having at least one processor, storage, and a communication platform, for providing statistics characterizing translation work in synchronizing content in different languages, wherein the at least one processor is configured for:
receiving a request from a user for accessing content hosted on a website in a first language, wherein the user requests to view the content in a second language and at least some of the content in the first language has previously been translated into the second language;
obtaining the content in the first language from the website via a publicly accessible network path based on the request;
parsing the obtained content in the first language into a plurality of translatable components;
accessing a database that stores the content in the second language previously translated as translated components;
identifying at least some of the plurality of translatable components that do not have a corresponding translated component in the database;
generating statistics based on the at least some of the translatable components to estimate the work load involved in translation of the at least some of the translatable components from the first language to the second language; and
providing the statistics to characterize a service related to synchronizing the content in the first and second languages.
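The parsing, identifying, and statistics steps recited above can be illustrated with a minimal sketch. It is not the applicants' implementation. The segmentation rule (one component per visible text run), the use of SHA-1 as the hash-code identifier, the in-memory set standing in for the translation database, and every name below (SegmentParser, segment_id, parse_components, untranslated_statistics) are assumptions made for illustration. Obtaining the pages over the network is omitted; the pages are passed in as HTML strings.

```python
import hashlib
import re
from html.parser import HTMLParser


class SegmentParser(HTMLParser):
    """Collects visible text runs as translatable components (assumed rule)."""

    SKIP = {"script", "style"}  # content of these tags is never shown as text

    def __init__(self):
        super().__init__()
        self.segments = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = " ".join(data.split())  # collapse whitespace
        if text and not self._skip_depth:
            self.segments.append(text)


def segment_id(text):
    # Hash-code identifier; a checksum or other algorithm would also fit claim 5.
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def parse_components(html):
    parser = SegmentParser()
    parser.feed(html)
    parser.close()
    return parser.segments


def untranslated_statistics(pages, translated_ids):
    """pages: list of HTML strings in the first language.
    translated_ids: set of identifiers that already have a stored translation
    (a stand-in for the database lookup)."""
    missing = []
    pages_with_missing = 0
    for html in pages:
        page_missing = [s for s in parse_components(html)
                        if segment_id(s) not in translated_ids]
        if page_missing:
            pages_with_missing += 1
        missing.extend(page_missing)
    words = [w.lower() for s in missing for w in re.findall(r"\w+", s)]
    return {
        "page_count": pages_with_missing,  # pages needing any translation (assumed meaning)
        "text_segment_count": len(missing),
        "unique_text_segment_count": len(set(missing)),
        "word_count": len(words),
        "unique_word_count": len(set(words)),
    }
```

Only components whose identifier is absent from the stored set contribute to the counts, so the figures estimate the remaining work rather than the size of the whole site. Comparing unique counts against total counts shows how much of that work is repeated text that needs translating only once.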
US15/447,289 2003-02-21 2017-03-02 Analyzing Web Site for Translation Abandoned US20170177567A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US15/447,289 US20170177567A1 (en) 2003-02-21 2017-03-02 Analyzing Web Site for Translation

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
US44957103P 2003-02-21 2003-02-21
US10/784,334 US7627817B2 (en) 2003-02-21 2004-02-23 Analyzing web site for translation
US12/609,834 US8566710B2 (en) 2003-02-21 2009-10-30 Analyzing web site for translation
US13/933,815 US9626360B2 (en) 2003-02-21 2013-07-02 Analyzing web site for translation
US15/447,289 US20170177567A1 (en) 2003-02-21 2017-03-02 Analyzing Web Site for Translation

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US13/933,815 Continuation US9626360B2 (en) 2003-02-21 2013-07-02 Analyzing web site for translation

Publications (1)

Publication Number Publication Date
US20170177567A1 true US20170177567A1 (en) 2017-06-22

Family

ID=32872168

Family Applications (17)

Application Number Title Priority Date Filing Date
US10/784,334 Active 2025-01-11 US7627817B2 (en) 2003-02-21 2004-02-23 Analyzing web site for translation
US10/784,726 Active 2026-11-04 US7627479B2 (en) 2003-02-21 2004-02-23 Automation tool for web site content language translation
US10/784,868 Active 2024-12-11 US7580960B2 (en) 2003-02-21 2004-02-23 Synchronization of web site content between languages
US10/784,727 Active 2025-09-08 US7584216B2 (en) 2003-02-21 2004-02-23 Dynamic language translation of web site content
US12/507,344 Expired - Lifetime US7996417B2 (en) 2003-02-21 2009-07-22 Dynamic language translation of web site content
US12/508,198 Expired - Lifetime US8065294B2 (en) 2003-02-21 2009-07-23 Synchronization of web site content between languages
US12/609,778 Expired - Lifetime US10409918B2 (en) 2003-02-21 2009-10-30 Automation tool for web site content language translation
US12/609,834 Active 2026-07-07 US8566710B2 (en) 2003-02-21 2009-10-30 Analyzing web site for translation
US13/096,464 Expired - Lifetime US8433718B2 (en) 2003-02-21 2011-04-28 Dynamic language translation of web site content
US13/742,211 Expired - Lifetime US8949223B2 (en) 2003-02-21 2013-01-15 Dynamic language translation of web site content
US13/933,815 Active 2026-07-03 US9626360B2 (en) 2003-02-21 2013-07-02 Analyzing web site for translation
US14/573,453 Expired - Lifetime US9367540B2 (en) 2003-02-21 2014-12-17 Dynamic language translation of web site content
US15/159,940 Expired - Lifetime US9652455B2 (en) 2003-02-21 2016-05-20 Dynamic language translation of web site content
US15/447,289 Abandoned US20170177567A1 (en) 2003-02-21 2017-03-02 Analyzing Web Site for Translation
US15/482,927 Expired - Lifetime US9910853B2 (en) 2003-02-21 2017-04-10 Dynamic language translation of web site content
US15/876,655 Expired - Fee Related US10621287B2 (en) 2003-02-21 2018-01-22 Dynamic language translation of web site content
US16/513,794 Expired - Lifetime US11308288B2 (en) 2003-02-21 2019-07-17 Automation tool for web site content language translation

Country Status (1)

Country Link
US (17) US7627817B2 (en)

Families Citing this family (330)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2000057320A2 (en) 1999-03-19 2000-09-28 Trados Gmbh Workflow management system
US20060116865A1 (en) 1999-09-17 2006-06-01 Www.Uniscape.Com E-services translation utilizing machine translation and translation memory
US8214196B2 (en) 2001-07-03 2012-07-03 University Of Southern California Syntax-based statistical translation model
WO2004001623A2 (en) 2002-03-26 2003-12-31 University Of Southern California Constructing a translation lexicon from comparable, non-parallel corpora
US20040024585A1 (en) * 2002-07-03 2004-02-05 Amit Srivastava Linguistic segmentation of speech
US20040004599A1 (en) * 2002-07-03 2004-01-08 Scott Shepard Systems and methods for facilitating playback of media
US7627817B2 (en) 2003-02-21 2009-12-01 Motionpoint Corporation Analyzing web site for translation
US8230112B2 (en) * 2003-03-27 2012-07-24 Siebel Systems, Inc. Dynamic support of multiple message formats
US7444590B2 (en) * 2003-06-25 2008-10-28 Microsoft Corporation Systems and methods for declarative localization of web services
CA2433512C (en) * 2003-06-26 2008-01-15 Ibm Canada Limited - Ibm Canada Limitee File translation
US8548794B2 (en) 2003-07-02 2013-10-01 University Of Southern California Statistical noun phrase translation
US7321852B2 (en) * 2003-10-28 2008-01-22 International Business Machines Corporation System and method for transcribing audio files of various languages
US8566081B2 (en) * 2004-03-25 2013-10-22 Stanley F. Schoenbach Method and system providing interpreting and other services from a remote location
US7293012B1 (en) * 2003-12-19 2007-11-06 Microsoft Corporation Friendly URLs
US20100262621A1 (en) * 2004-03-05 2010-10-14 Russ Ross In-context exact (ice) matching
US7983896B2 (en) * 2004-03-05 2011-07-19 SDL Language Technology In-context exact (ICE) matching
US8296127B2 (en) 2004-03-23 2012-10-23 University Of Southern California Discovery of parallel text portions in comparable collections of corpora and training using comparable texts
US20050223317A1 (en) * 2004-03-31 2005-10-06 Byrer Loralie A Content management system
US8666725B2 (en) 2004-04-16 2014-03-04 University Of Southern California Selection and use of nonstatistical translation components in a statistical machine translation framework
US7707265B2 (en) * 2004-05-15 2010-04-27 International Business Machines Corporation System, method, and service for interactively presenting a summary of a web site
US7437364B1 (en) 2004-06-30 2008-10-14 Google Inc. System and method of accessing a document efficiently through multi-tier web caching
US8224964B1 (en) 2004-06-30 2012-07-17 Google Inc. System and method of accessing a document efficiently through multi-tier web caching
US8676922B1 (en) 2004-06-30 2014-03-18 Google Inc. Automatic proxy setting modification
JP2006025127A (en) * 2004-07-07 2006-01-26 Canon Inc Image processor and control method thereof
JP2006033085A (en) * 2004-07-12 2006-02-02 Canon Inc Image processing apparatus and control method thereof
US20060070022A1 (en) * 2004-09-29 2006-03-30 International Business Machines Corporation URL mapping with shadow page support
JP2006099296A (en) * 2004-09-29 2006-04-13 Nec Corp Translation system, translation communication system, machine translation method and program
DE112005002534T5 (en) 2004-10-12 2007-11-08 University Of Southern California, Los Angeles Training for a text-to-text application that uses a string-tree transformation for training and decoding
US7669198B2 (en) * 2004-11-18 2010-02-23 International Business Machines Corporation On-demand translator for localized operating systems
US7624092B2 (en) * 2004-11-19 2009-11-24 Sap Aktiengesellschaft Concept-based content architecture
US7716641B2 (en) * 2004-12-01 2010-05-11 Microsoft Corporation Method and system for automatically identifying and marking subsets of localizable resources
US7617092B2 (en) 2004-12-01 2009-11-10 Microsoft Corporation Safe, secure resource editing for application localization
US20060116864A1 (en) * 2004-12-01 2006-06-01 Microsoft Corporation Safe, secure resource editing for application localization with automatic adjustment of application user interface for translated resources
US20060122872A1 (en) * 2004-12-06 2006-06-08 Stevens Harold L Graphical user interface for and method of use for a computer-implemented system and method for booking travel itineraries
WO2006068645A1 (en) * 2004-12-21 2006-06-29 Kunz Linda H Multicultural and multimedia data collection and documentation computer system, apparatus and method
FR2880716A1 (en) * 2005-01-13 2006-07-14 Gemplus Sa CUSTOMIZATION OF SERVICE IN A TERMINAL DEVICE
US7536640B2 (en) * 2005-01-28 2009-05-19 Oracle International Corporation Advanced translation context via web pages embedded with resource information
EP1693830B1 (en) * 2005-02-21 2017-12-20 Harman Becker Automotive Systems GmbH Voice-controlled data system
US20060206797A1 (en) * 2005-03-08 2006-09-14 Microsoft Corporation Authorizing implementing application localization rules
US8219907B2 (en) * 2005-03-08 2012-07-10 Microsoft Corporation Resource authoring with re-usability score and suggested re-usable data
JP2006277103A (en) * 2005-03-28 2006-10-12 Fuji Xerox Co Ltd Document translating method and its device
WO2006116676A2 (en) * 2005-04-28 2006-11-02 Wms Gaming Inc. Wagering game device having ubiquitous character set
EP1722307A1 (en) * 2005-05-09 2006-11-15 Amadeus s.a.s Dynamic method for XML documents generation from a database
US7958446B2 (en) * 2005-05-17 2011-06-07 Yahoo! Inc. Systems and methods for language translation in network browsing applications
US9582602B2 (en) 2005-05-17 2017-02-28 Excalibur Ip, Llc Systems and methods for improving access to syndication feeds in network browsing applications
US20070174286A1 (en) * 2005-05-17 2007-07-26 Yahoo!, Inc. Systems and methods for providing features and user interface in network browsing applications
US7882116B2 (en) * 2005-05-18 2011-02-01 International Business Machines Corporation Method for localization of programming modeling resources
US7640255B2 (en) 2005-05-31 2009-12-29 Sap, Ag Method for utilizing a multi-layered data model to generate audience specific documents
US7657511B2 (en) * 2005-05-31 2010-02-02 Sap, Ag Multi-layered data model for generating audience-specific documents
US8886517B2 (en) 2005-06-17 2014-11-11 Language Weaver, Inc. Trust scoring for language translation systems
US8676563B2 (en) 2009-10-01 2014-03-18 Language Weaver, Inc. Providing human-generated and machine-generated trusted translations
JP2007011733A (en) * 2005-06-30 2007-01-18 Dynacomware Taiwan Inc Method, device and system for preparing asian web font document
US7668904B2 (en) * 2005-07-28 2010-02-23 International Business Machines Corporation Session replication
US8126702B2 (en) * 2005-08-01 2012-02-28 Sap Ag Translating data objects
US8700383B2 (en) * 2005-08-25 2014-04-15 Multiling Corporation Translation quality quantifying apparatus and method
US10319252B2 (en) 2005-11-09 2019-06-11 Sdl Inc. Language capability assessment and training apparatus and techniques
US8392872B2 (en) * 2005-11-19 2013-03-05 International Business Machines Corporation Pseudo translation within integrated development environment
US7822596B2 (en) * 2005-12-05 2010-10-26 Microsoft Corporation Flexible display translation
US7792843B2 (en) * 2005-12-21 2010-09-07 Adobe Systems Incorporated Web analytics data ranking and audio presentation
US7814425B1 (en) 2005-12-30 2010-10-12 Aol Inc. Thumbnail image previews
CN101361065B (en) * 2006-02-17 2013-04-10 Google Inc. Encoding and adaptive, scalable accessing of distributed models
JP4782591B2 (en) * 2006-03-10 2011-09-28 Fujitsu Semiconductor Ltd. Reconfigurable circuit
ES2493921T3 (en) 2006-03-14 2014-09-12 University Of Southern California MEMS device for the administration of therapeutic agents
US7881923B2 (en) * 2006-03-31 2011-02-01 Research In Motion Limited Handheld electronic device including toggle of a selected data source, and associated method
US8943080B2 (en) * 2006-04-07 2015-01-27 University Of Southern California Systems and methods for identifying parallel documents and sentence fragments in multilingual document collections
US8209162B2 (en) * 2006-05-01 2012-06-26 Microsoft Corporation Machine translation split between front end and back end processors
US7747749B1 (en) * 2006-05-05 2010-06-29 Google Inc. Systems and methods of efficiently preloading documents to client devices
US8924194B2 (en) 2006-06-20 2014-12-30 At&T Intellectual Property Ii, L.P. Automatic translation of advertisements
US8009566B2 (en) 2006-06-26 2011-08-30 Palo Alto Networks, Inc. Packet classification in a network security device
US8584005B1 (en) * 2006-06-28 2013-11-12 Adobe Systems Incorporated Previewing redaction content in a document
US8886518B1 (en) 2006-08-07 2014-11-11 Language Weaver, Inc. System and method for capitalizing machine translated text
US20080033711A1 (en) * 2006-08-07 2008-02-07 Atkin Steven E Method and system for testing translatability of non-textual resources
US8249855B2 (en) * 2006-08-07 2012-08-21 Microsoft Corporation Identifying parallel bilingual data over a network
US20080040094A1 (en) * 2006-08-08 2008-02-14 Employease, Inc. Proxy For Real Time Translation of Source Objects Between A Server And A Client
US8521506B2 (en) * 2006-09-21 2013-08-27 Sdl Plc Computer-implemented method, computer software and apparatus for use in a translation system
US7984430B2 (en) * 2006-09-27 2011-07-19 Electronics And Telecommunications Research Institute Parser framework using markup language
US7742833B1 (en) 2006-09-28 2010-06-22 Rockwell Automation Technologies, Inc. Auto discovery of embedded historians in network
US7672740B1 (en) 2006-09-28 2010-03-02 Rockwell Automation Technologies, Inc. Conditional download of data from embedded historians
US7913228B2 (en) * 2006-09-29 2011-03-22 Rockwell Automation Technologies, Inc. Translation viewer for project documentation and editing
US8181157B2 (en) 2006-09-29 2012-05-15 Rockwell Automation Technologies, Inc. Custom language support for project documentation and editing
US20080086310A1 (en) * 2006-10-09 2008-04-10 Kent Campbell Automated Contextually Specific Audio File Generator
DE102006051092B4 (en) * 2006-10-25 2008-11-27 Sirvaluse Consulting Gmbh Computer-aided method for remote recording of user behavior when receiving web pages
US8433556B2 (en) 2006-11-02 2013-04-30 University Of Southern California Semi-supervised training for statistical word alignment
US20080115072A1 (en) * 2006-11-09 2008-05-15 International Business Machines Corporation Method and apparatus for visually assisting language input mode indentification
US7933666B2 (en) 2006-11-10 2011-04-26 Rockwell Automation Technologies, Inc. Adjustable data collection rate for embedded historians
US20080114474A1 (en) * 2006-11-10 2008-05-15 Rockwell Automation Technologies, Inc. Event triggered data capture via embedded historians
WO2008070877A2 (en) * 2006-12-08 2008-06-12 Hall Patrick J Online computer-aided translation
US9122674B1 (en) 2006-12-15 2015-09-01 Language Weaver, Inc. Use of annotations in statistical machine translation
US20080168049A1 (en) * 2007-01-08 2008-07-10 Microsoft Corporation Automatic acquisition of a parallel corpus from a network
US8468149B1 (en) * 2007-01-26 2013-06-18 Language Weaver, Inc. Multi-lingual online community
US8812651B1 (en) 2007-02-15 2014-08-19 Google Inc. Systems and methods for client cache awareness
US8065275B2 (en) 2007-02-15 2011-11-22 Google Inc. Systems and methods for cache optimization
JP4979414B2 (en) 2007-02-28 2012-07-18 International Business Machines Corporation Management server, computer program, and method for provisioning in a multi-locale mixed environment
US20080225808A1 (en) * 2007-03-14 2008-09-18 Jorge Eduardo Springmuhl Samayoa Integrated media system and method
US7895030B2 (en) * 2007-03-16 2011-02-22 International Business Machines Corporation Visualization method for machine translation
US8615389B1 (en) 2007-03-16 2013-12-24 Language Weaver, Inc. Generation and exploitation of an approximate language model
US8831928B2 (en) * 2007-04-04 2014-09-09 Language Weaver, Inc. Customizable machine translation service
US20080256480A1 (en) * 2007-04-06 2008-10-16 Sbs Information Systems Co., Ltd. Data gathering and processing system
US20080288239A1 (en) * 2007-05-15 2008-11-20 Microsoft Corporation Localization and internationalization of document resources
US7974937B2 (en) 2007-05-17 2011-07-05 Rockwell Automation Technologies, Inc. Adaptive embedded historians with aggregator component
US9361294B2 (en) 2007-05-31 2016-06-07 Red Hat, Inc. Publishing tool for translating documents
US10296588B2 (en) * 2007-05-31 2019-05-21 Red Hat, Inc. Build of material production system
US8205151B2 (en) * 2007-05-31 2012-06-19 Red Hat, Inc. Syndication of documents in increments
US8825466B1 (en) 2007-06-08 2014-09-02 Language Weaver, Inc. Modification of annotated bilingual segment pairs in syntax-based machine translation
US20080312902A1 (en) * 2007-06-18 2008-12-18 Russell Kenneth Dollinger Interlanguage communication with verification
CN101364970B (en) * 2007-08-09 2012-06-20 Hongfujin Precision Industry (Shenzhen) Co., Ltd. Webpage material download control system and method
US7930261B2 (en) * 2007-09-26 2011-04-19 Rockwell Automation Technologies, Inc. Historians embedded in industrial units
US7930639B2 (en) * 2007-09-26 2011-04-19 Rockwell Automation Technologies, Inc. Contextualization for historians in industrial systems
US7917857B2 (en) * 2007-09-26 2011-03-29 Rockwell Automation Technologies, Inc. Direct subscription to intelligent I/O module
KR20230156158A (en) * 2007-09-26 2023-11-13 AQ Media Inc. Audio-visual navigation and communication
US7882218B2 (en) * 2007-09-27 2011-02-01 Rockwell Automation Technologies, Inc. Platform independent historian
US7962440B2 (en) * 2007-09-27 2011-06-14 Rockwell Automation Technologies, Inc. Adaptive industrial systems via embedded historian data
US7809656B2 (en) * 2007-09-27 2010-10-05 Rockwell Automation Technologies, Inc. Microhistorians as proxies for data transfer
US8086440B2 (en) * 2007-10-16 2011-12-27 Lockheed Martin Corporation System and method of prioritizing automated translation of communications from a first human language to a second human language
US8645120B2 (en) * 2007-10-16 2014-02-04 Lockheed Martin Corporation System and method of prioritizing automated translation of communications from a first human language to a second human language
US20090119091A1 (en) * 2007-11-01 2009-05-07 Eitan Chaim Sarig Automated pattern based human assisted computerized translation network systems
US8572065B2 (en) * 2007-11-09 2013-10-29 Microsoft Corporation Link discovery from web scripts
US9122650B1 (en) 2007-11-14 2015-09-01 Appcelerator, Inc. Web server based on the same paradigms as web clients
US8914774B1 (en) 2007-11-15 2014-12-16 Appcelerator, Inc. System and method for tagging code to determine where the code runs
US8954989B1 (en) 2007-11-19 2015-02-10 Appcelerator, Inc. Flexible, event-driven JavaScript server architecture
US8260845B1 (en) 2007-11-21 2012-09-04 Appcelerator, Inc. System and method for auto-generating JavaScript proxies and meta-proxies
US8566807B1 (en) 2007-11-23 2013-10-22 Appcelerator, Inc. System and method for accessibility of document object model and JavaScript by other platforms
US8719451B1 (en) 2007-11-23 2014-05-06 Appcelerator, Inc. System and method for on-the-fly, post-processing document object model manipulation
JP5102593B2 (en) * 2007-11-30 2012-12-19 International Business Machines Corporation Apparatus and method for controlling display of document data
US8806431B1 (en) 2007-12-03 2014-08-12 Appcelerator, Inc. Aspect oriented programming
US8756579B1 (en) 2007-12-03 2014-06-17 Appcelerator, Inc. Client-side and server-side unified validation
US8819539B1 (en) 2007-12-03 2014-08-26 Appcelerator, Inc. On-the-fly rewriting of uniform resource locators in a web-page
US8140969B2 (en) * 2007-12-03 2012-03-20 International Business Machines Corporation Displaying synchronously documents to a user
US8527860B1 (en) 2007-12-04 2013-09-03 Appcelerator, Inc. System and method for exposing the dynamic web server-side
US8938491B1 (en) 2007-12-04 2015-01-20 Appcelerator, Inc. System and method for secure binding of client calls and server functions
US8639743B1 (en) 2007-12-05 2014-01-28 Appcelerator, Inc. System and method for on-the-fly rewriting of JavaScript
US8285813B1 (en) 2007-12-05 2012-10-09 Appcelerator, Inc. System and method for emulating different user agents on a server
US8335982B1 (en) 2007-12-05 2012-12-18 Appcelerator, Inc. System and method for binding a document object model through JavaScript callbacks
US8185606B2 (en) * 2007-12-12 2012-05-22 International Business Machines Corporation Email change tracking
US7974832B2 (en) * 2007-12-12 2011-07-05 Microsoft Corporation Web translation provider
US9418061B2 (en) * 2007-12-14 2016-08-16 International Business Machines Corporation Prioritized incremental asynchronous machine translation of structured documents
MX2010006840A (en) 2007-12-20 2010-08-12 Univ Southern California Apparatus and methods for delivering therapeutic agents.
JP2009176144A (en) * 2008-01-25 2009-08-06 Access Co Ltd System, apparatus, method and program for converting markup language document
US9201870B2 (en) * 2008-01-25 2015-12-01 First Data Corporation Method and system for providing translated dynamic web page content
TW200933398A (en) * 2008-01-28 2009-08-01 Inventec Corp Method of accessing files with XML documents of Windows formation under Linux
US20090234633A1 (en) * 2008-03-17 2009-09-17 Virginia Chao-Suren Systems and methods for enabling inter-language communications
US8910110B2 (en) * 2008-03-19 2014-12-09 Oracle International Corporation Application translation cost estimator
US9333297B2 (en) 2008-05-08 2016-05-10 Minipumps, Llc Drug-delivery pump with intelligent control
ES2534864T3 (en) 2008-05-08 2015-04-29 Minipumps, Llc Implantable pumps and cannulas for them
US8231609B2 (en) 2008-05-08 2012-07-31 Minipumps, Llc Drug-delivery pumps and methods of manufacture
US8250083B2 (en) * 2008-05-16 2012-08-21 Enpulz, Llc Support for international search terms—translate as you crawl
US9304785B2 (en) * 2008-06-02 2016-04-05 International Business Machines Corporation Localizing a software product
US8291079B1 (en) 2008-06-04 2012-10-16 Appcelerator, Inc. System and method for developing, deploying, managing and monitoring a web application in a single environment
US8880678B1 (en) 2008-06-05 2014-11-04 Appcelerator, Inc. System and method for managing and monitoring a web application using multiple cloud providers
CN101615181B (en) * 2008-06-27 2012-05-16 International Business Machines Corporation System and method for establishing internationalized network application
US20090327466A1 (en) * 2008-06-27 2009-12-31 Microsoft Corporation Internal uniform resource locator formulation and testing
US20100017293A1 (en) * 2008-07-17 2010-01-21 Language Weaver, Inc. System, method, and computer program for providing multilingual text advertisments
US9262409B2 (en) * 2008-08-06 2016-02-16 Abbyy Infopoisk Llc Translation of a selected text fragment of a screen
US7596620B1 (en) 2008-11-04 2009-09-29 Aptana, Inc. System and method for developing, deploying, managing and monitoring a web application in a single environment
US20100050112A1 (en) * 2008-08-22 2010-02-25 Inventec Corporation System and method of immediate translation display
US9824071B2 (en) * 2008-12-03 2017-11-21 Microsoft Technology Licensing, Llc Viewing messages and message attachments in different languages
US8873556B1 (en) 2008-12-24 2014-10-28 Palo Alto Networks, Inc. Application based packet forwarding
GB2468278A (en) * 2009-03-02 2010-09-08 Sdl Plc Computer assisted natural language translation outputs selectable target text associated in bilingual corpus with input target text from partial translation
US9262403B2 (en) 2009-03-02 2016-02-16 Sdl Plc Dynamic generation of auto-suggest dictionary for natural language translation
KR101642449B1 (en) * 2009-03-18 2016-07-25 Google Inc. Web translation with display replacement
US10671698B2 (en) 2009-05-26 2020-06-02 Microsoft Technology Licensing, Llc Language translation using embeddable component
US9405745B2 (en) * 2009-06-01 2016-08-02 Microsoft Technology Licensing, Llc Language translation using embeddable component
US8312390B2 (en) 2009-06-10 2012-11-13 Microsoft Corporation Dynamic screentip language translation
US8990064B2 (en) 2009-07-28 2015-03-24 Language Weaver, Inc. Translating documents based on content
WO2011020072A1 (en) 2009-08-14 2011-02-17 Stephen Allyn Joyce Data encoding method
CN102576385B (en) 2009-08-18 2016-02-24 MiniPumps, LLC Electrolytic drug-delivery pump with adaptive control
US8380486B2 (en) 2009-10-01 2013-02-19 Language Weaver, Inc. Providing machine-generated translations and corresponding trust levels
US20120246557A1 (en) * 2009-10-12 2012-09-27 Hcl Technologies Limited System and method for transcoding web content adaptable to multiple client devices
US9135349B2 (en) * 2010-01-12 2015-09-15 Maverick Multimedia, Inc. Automatic technical language extension engine
US8566078B2 (en) * 2010-01-29 2013-10-22 International Business Machines Corporation Game based method for translation data acquisition and evaluation
US20110196854A1 (en) * 2010-02-05 2011-08-11 Sarkar Zainul A Providing a www access to a web page
US9152484B2 (en) 2010-02-26 2015-10-06 Red Hat, Inc. Generating predictive diagnostics via package update manager
US10534624B2 (en) * 2010-02-26 2020-01-14 Red Hat, Inc. Generating and storing translation information as package metadata
US10417646B2 (en) 2010-03-09 2019-09-17 Sdl Inc. Predicting the cost associated with translating textual content
US9244913B2 (en) * 2010-03-19 2016-01-26 Verizon Patent And Licensing Inc. Multi-language closed captioning
US20110282647A1 (en) * 2010-05-12 2011-11-17 IQTRANSLATE.COM S.r.l. Translation System and Method
EP2572299A1 (en) * 2010-05-17 2013-03-27 Green SQL Ltd Database translation system and method
US9767095B2 (en) * 2010-05-21 2017-09-19 Western Standard Publishing Company, Inc. Apparatus, system, and method for computer aided translation
US20110289424A1 (en) * 2010-05-21 2011-11-24 Microsoft Corporation Secure application of custom resources in multi-tier systems
US8327261B2 (en) * 2010-06-08 2012-12-04 Oracle International Corporation Multilingual tagging of content with conditional display of unilingual tags
US9213685B2 (en) 2010-07-13 2015-12-15 Motionpoint Corporation Dynamic language translation of web site content
US20120022851A1 (en) * 2010-07-23 2012-01-26 International Business Machines Corporation On-demand translation of application text
US8473911B1 (en) * 2010-07-23 2013-06-25 Xilinx, Inc. Documentation generation from a computer readable symbolic representation
CN102467497B (en) * 2010-10-29 2014-11-05 国际商业机器公司 Method and system for text translation in verification program
US9645722B1 (en) * 2010-11-19 2017-05-09 A9.Com, Inc. Preview search results
US9864611B2 (en) * 2010-12-15 2018-01-09 Microsoft Technology Licensing, Llc Extensible template pipeline for web applications
US9679404B2 (en) 2010-12-23 2017-06-13 Microsoft Technology Licensing, Llc Techniques for dynamic layout of presentation tiles on a grid
US9436685B2 (en) 2010-12-23 2016-09-06 Microsoft Technology Licensing, Llc Techniques for electronic aggregation of information
US20120166953A1 (en) * 2010-12-23 2012-06-28 Microsoft Corporation Techniques for electronic aggregation of information
US20120166526A1 (en) * 2010-12-27 2012-06-28 Amit Ashok Ambardekar Request forwarding and result aggregating systems, methods and computer readable media
US9164988B2 (en) * 2011-01-14 2015-10-20 Lionbridge Technologies, Inc. Methods and systems for the dynamic creation of a translated website
US9128929B2 (en) 2011-01-14 2015-09-08 Sdl Language Technologies Systems and methods for automatically estimating a translation time including preparation time in addition to the translation itself
ES2634669T3 (en) 2011-02-08 2017-09-28 Halozyme, Inc. Composition and lipid formulation of a hyaluronan degradation enzyme and use thereof for the treatment of benign prostatic hyperplasia
NL2006294C2 (en) * 2011-02-24 2012-08-27 Exvo Com Group B V Website translator, system, and method.
US10140320B2 (en) 2011-02-28 2018-11-27 Sdl Inc. Systems, methods, and media for generating analytical data
US20120221319A1 (en) * 2011-02-28 2012-08-30 Andrew Trese Systems, Methods and Media for Translating Informational Content
JP5908213B2 (en) * 2011-02-28 2016-04-26 ブラザー工業株式会社 Communication device
US8527259B1 (en) * 2011-02-28 2013-09-03 Google Inc. Contextual translation of digital content
US8291311B2 (en) * 2011-03-07 2012-10-16 Showcase-TV Inc. Web display program conversion system, web display program conversion method and program for converting web display program
US20120246561A1 (en) * 2011-03-22 2012-09-27 Toby Doig Systems and methods for extended content harvesting for contextualizing
US9183199B2 (en) * 2011-03-25 2015-11-10 Ming-Yuan Wu Communication device for multiple language translation system
US9715485B2 (en) 2011-03-28 2017-07-25 Microsoft Technology Licensing, Llc Techniques for electronic aggregation of information
US11003838B2 (en) 2011-04-18 2021-05-11 Sdl Inc. Systems and methods for monitoring post translation editing
US20120271823A1 (en) * 2011-04-25 2012-10-25 Rovi Technologies Corporation Automated discovery of content and metadata
MY159469A (en) * 2011-04-28 2017-01-13 Rakuten Inc Browsing System, Terminal, Image Server, Computer-Readable Recording Medium Recording Program, and Method
US8538742B2 (en) * 2011-05-20 2013-09-17 Google Inc. Feed translation for a social network
US9047441B2 (en) 2011-05-24 2015-06-02 Palo Alto Networks, Inc. Malware analysis system
US8695096B1 (en) * 2011-05-24 2014-04-08 Palo Alto Networks, Inc. Automatic signature generation for malicious PDF files
US20120323707A1 (en) * 2011-06-14 2012-12-20 Urban Translations, LLC. Multi-Language Electronic Menu System and Method
US8694303B2 (en) 2011-06-15 2014-04-08 Language Weaver, Inc. Systems and methods for tuning parameters in statistical machine translation
WO2012174703A1 (en) 2011-06-20 2012-12-27 Microsoft Corporation Hover translation of search result captions
US20120330644A1 (en) * 2011-06-22 2012-12-27 Salesforce.Com Inc. Multi-lingual knowledge base
US20130014084A1 (en) * 2011-07-05 2013-01-10 Microsoft Corporation International Testing Platform
US9984054B2 (en) 2011-08-24 2018-05-29 Sdl Inc. Web interface including the review and manipulation of a web document and utilizing permission based control
US8954315B2 (en) * 2011-10-10 2015-02-10 Ca, Inc. System and method for mixed-language support for applications
US9195648B2 (en) * 2011-10-12 2015-11-24 Salesforce.Com, Inc. Multi-lingual knowledge base
US8886515B2 (en) 2011-10-19 2014-11-11 Language Weaver, Inc. Systems and methods for enhancing machine translation post edit review processes
US9195653B2 (en) * 2011-10-24 2015-11-24 Google Inc. Identification of in-context resources that are not fully localized
US9645989B2 (en) * 2011-11-04 2017-05-09 Sas Institute Inc. Techniques to generate custom electronic forms using custom content
US8712761B2 (en) * 2011-11-22 2014-04-29 Google Inc. Techniques for performing translation of messages
US9081769B2 (en) * 2011-11-25 2015-07-14 Google Inc. Providing translation assistance in application localization
CN104205093B (en) * 2012-02-03 2018-04-20 谷歌有限责任公司 Translated news
US9213695B2 (en) 2012-02-06 2015-12-15 Language Line Services, Inc. Bridge from machine language interpretation to human language interpretation
US9658998B2 (en) * 2012-02-24 2017-05-23 American Express Travel Related Services Company, Inc. Systems and methods for internationalization and localization
US9251223B2 (en) * 2012-02-29 2016-02-02 Google Inc. Alternative web pages suggestion based on language
US8942973B2 (en) 2012-03-09 2015-01-27 Language Weaver, Inc. Content page URL translation
US9418060B1 (en) * 2012-03-19 2016-08-16 Amazon Technologies, Inc. Sample translation reviews
US9213693B2 (en) 2012-04-03 2015-12-15 Language Line Services, Inc. Machine language interpretation assistance for human language interpretation
US10261994B2 (en) 2012-05-25 2019-04-16 Sdl Inc. Method and system for automatic management of reputation of translators
CN103488648B (en) 2012-06-13 2018-03-20 阿里巴巴集团控股有限公司 Multilingual mixed index method and system
US20140006004A1 (en) * 2012-07-02 2014-01-02 Microsoft Corporation Generating localized user interfaces
US20140039871A1 (en) * 2012-08-02 2014-02-06 Richard Henry Dana Crawford Synchronous Texts
WO2014027237A1 (en) * 2012-08-12 2014-02-20 Bablic Ltd. Systems and methods for web localization
US20140081618A1 (en) * 2012-09-17 2014-03-20 Salesforce.Com, Inc. Designing a website to be displayed in multiple languages
US9400848B2 (en) * 2012-09-26 2016-07-26 Google Inc. Techniques for context-based grouping of messages for translation
US9736210B2 (en) * 2012-10-01 2017-08-15 Dexcom, Inc. Analyte data retriever
US9916306B2 (en) 2012-10-19 2018-03-13 Sdl Inc. Statistical linguistic analysis of source content
JP2014089637A (en) * 2012-10-31 2014-05-15 International Business Machines Corporation Method, computer, and computer program for determining translations corresponding to words or phrases in image data to be translated differently
US9330402B2 (en) 2012-11-02 2016-05-03 Intuit Inc. Method and system for providing a payroll preparation platform with user contribution-based plug-ins
WO2014074629A1 (en) 2012-11-06 2014-05-15 Intuit Inc. Stack-based adaptive localization and internationalization of applications
US10638221B2 (en) 2012-11-13 2020-04-28 Adobe Inc. Time interval sound alignment
US10249321B2 (en) * 2012-11-20 2019-04-02 Adobe Inc. Sound rate modification
US9152622B2 (en) 2012-11-26 2015-10-06 Language Weaver, Inc. Personalized machine translation via online adaptation
US10455219B2 (en) 2012-11-30 2019-10-22 Adobe Inc. Stereo correspondence and depth sensors
US9170826B2 (en) * 2012-12-17 2015-10-27 Sap Se Common multi-language text management in a business-oriented software framework
US11222362B2 (en) * 2013-01-15 2022-01-11 Motionpoint Corporation Dynamic determination of localization source for web site content
US9591052B2 (en) 2013-02-05 2017-03-07 Apple Inc. System and method for providing a content distribution network with data quality monitoring and management
US9906615B1 (en) 2013-02-28 2018-02-27 Open Text Sa Ulc System and method for selective activation of site features
US9262405B1 (en) * 2013-02-28 2016-02-16 Google Inc. Systems and methods of serving a content item to a user in a specific language
US9519642B2 (en) 2013-02-28 2016-12-13 Open Text Sa Ulc System, method and computer program product for multilingual content management
US9658999B2 (en) * 2013-03-01 2017-05-23 Sony Corporation Language processing method and electronic device
US9363329B1 (en) * 2013-03-15 2016-06-07 Instart Logic, Inc. Identifying correlated components of dynamic content
US9298455B1 (en) 2013-03-15 2016-03-29 Instart Logic, Inc. Provisional execution of dynamic content component
US9916295B1 (en) * 2013-03-15 2018-03-13 Richard Henry Dana Crawford Synchronous context alignments
US9342499B2 (en) * 2013-03-19 2016-05-17 Educational Testing Service Round-trip translation for automated grammatical error correction
US9858052B2 (en) * 2013-03-21 2018-01-02 Razer (Asia-Pacific) Pte. Ltd. Decentralized operating system
EP2784663A1 (en) * 2013-03-26 2014-10-01 Kiss, Laszlo Method system and computer program product for collecting, sending and following language requests for mobile applications
US9977684B2 (en) 2013-06-12 2018-05-22 Sap Se Self-learning localization service
CN104346153B (en) 2013-07-31 2018-04-17 国际商业机器公司 Method and system for translating text information of an application
US9747267B2 (en) * 2013-08-12 2017-08-29 Adobe Systems Incorporated Document editing synchronization
US9922351B2 (en) 2013-08-29 2018-03-20 Intuit Inc. Location-based adaptation of financial management system
US9213694B2 (en) 2013-10-10 2015-12-15 Language Weaver, Inc. Efficient online domain adaptation
KR20150050947A (en) * 2013-11-01 2015-05-11 삼성전자주식회사 Method and apparatus for translation
KR101740332B1 (en) * 2013-11-05 2017-06-08 한국전자통신연구원 Apparatus and method for automatic translation
US20150254236A1 (en) * 2014-03-13 2015-09-10 Michael Lewis Moravitz Translation software built into internet
US20150261880A1 (en) * 2014-03-15 2015-09-17 Google Inc. Techniques for translating user interfaces of web-based applications
US9524294B2 (en) * 2014-04-10 2016-12-20 Institut Fur Rundfunktechnik Gmbh Circuitry for a commentator and/or simultaneous translator system, operating unit and commentator and/or simultaneous translator system
US9235569B1 (en) 2014-06-26 2016-01-12 Google Inc. Techniques for on-the-spot translation of web-based applications without annotating user interface strings
WO2016018004A1 (en) * 2014-07-31 2016-02-04 Samsung Electronics Co., Ltd. Method, apparatus, and system for providing translated content
US20160043982A1 (en) * 2014-08-11 2016-02-11 Facebook, Inc. Techniques for a sequential message reader for message syncing
US9465719B2 (en) * 2014-08-12 2016-10-11 Red Hat, Inc. Localized representation of stack traces
HUE043847T2 (en) 2014-08-28 2019-09-30 Halozyme Inc Combination therapy with a hyaluronan-degrading enzyme and an immune checkpoint inhibitor
US9817808B2 (en) * 2014-09-29 2017-11-14 International Business Machines Corporation Translation using related term pairs
JP6320982B2 (en) * 2014-11-26 2018-05-09 ネイバー コーポレーションNAVER Corporation Translated sentence editor providing apparatus and translated sentence editor providing method
US10261996B2 (en) 2014-12-19 2019-04-16 Dropbox, Inc. Content localization using fallback translations
US10425464B2 (en) 2015-01-08 2019-09-24 Instart Logic, Inc. Adaptive learning periods in HTML streaming
US10248537B2 (en) 2015-04-28 2019-04-02 Microsoft Technology Licensing, Llc Translation bug prediction classifier
US9430466B1 (en) * 2015-08-26 2016-08-30 Google Inc. Techniques for crowd sourcing human translations to provide translated versions of web pages with additional content
US10078504B1 (en) * 2015-09-16 2018-09-18 Amazon Technologies, Inc. Automated software internationalization and localization
US10075482B2 (en) * 2015-09-25 2018-09-11 International Business Machines Corporation Multiplexed, multimodal conferencing
US9830384B2 (en) 2015-10-29 2017-11-28 International Business Machines Corporation Foreign organization name matching
US9690777B1 (en) * 2015-12-10 2017-06-27 Webinterpret Translating website listings and propagating the translated listings to listing websites in other regions
US20170168999A1 (en) * 2015-12-14 2017-06-15 International Business Machines Corporation Translating web applications based on a context model
US9659010B1 (en) * 2015-12-28 2017-05-23 International Business Machines Corporation Multiple language screen capture
US10706069B2 (en) 2016-06-30 2020-07-07 Facebook, Inc. Techniques for replication of a client database to remote devices
US10261995B1 (en) 2016-09-28 2019-04-16 Amazon Technologies, Inc. Semantic and natural language processing for content categorization and routing
US10229113B1 (en) 2016-09-28 2019-03-12 Amazon Technologies, Inc. Leveraging content dimensions during the translation of human-readable languages
US10235362B1 (en) 2016-09-28 2019-03-19 Amazon Technologies, Inc. Continuous translation refinement with automated delivery of re-translated content
US10275459B1 (en) * 2016-09-28 2019-04-30 Amazon Technologies, Inc. Source language content scoring for localizability
US10223356B1 (en) 2016-09-28 2019-03-05 Amazon Technologies, Inc. Abstraction of syntax in localization through pre-rendering
US10706033B2 (en) 2016-10-21 2020-07-07 Open Text Sa Ulc Content management system and method for managing ad-hoc collections of content
US11403078B2 (en) * 2016-10-21 2022-08-02 Vmware, Inc. Interface layout interference detection
US11516300B2 (en) * 2016-11-30 2022-11-29 Hughes Network Systems, Llc System, method and program for localizing web page interfaces via asynchronous data and automatic binding
US10235361B2 (en) 2017-02-15 2019-03-19 International Business Machines Corporation Context-aware translation memory to facilitate more accurate translation
WO2018182316A2 (en) * 2017-03-28 2018-10-04 한양대학교 산학협력단 Method for detecting web-based nucleic acid double-strand break position and electronic device using same
US10437935B2 (en) 2017-04-18 2019-10-08 Salesforce.Com, Inc. Natural language translation and localization
US10489513B2 (en) 2017-04-19 2019-11-26 Salesforce.Com, Inc. Web application localization
US10795799B2 (en) * 2017-04-18 2020-10-06 Salesforce.Com, Inc. Website debugger for natural language translation and localization
CN108804471A (en) * 2017-05-04 2018-11-13 北大方正集团有限公司 Webpage generating method and device
US10635863B2 (en) 2017-10-30 2020-04-28 Sdl Inc. Fragment recall and adaptive automated translation
US10817676B2 (en) 2017-12-27 2020-10-27 Sdl Inc. Intelligent routing services and systems
US11153254B2 (en) * 2018-01-02 2021-10-19 International Business Machines Corporation Meme intelligent conversion
US20210334476A1 (en) * 2018-01-25 2021-10-28 Hewlett-Packard Development Company, L.P. Language-neutral translation memories
WO2019163117A1 (en) * 2018-02-26 2019-08-29 Loveland株式会社 Webpage translation system, webpage translation device, webpage provision device, and webpage translation method
WO2019210977A1 (en) * 2018-05-04 2019-11-07 Telefonaktiebolaget Lm Ericsson (Publ) Methods and apparatus for enriching entities with alternative texts in multiple languages
US10540452B1 (en) * 2018-06-21 2020-01-21 Amazon Technologies, Inc. Automated translation of applications
US10795686B2 (en) * 2018-08-31 2020-10-06 International Business Machines Corporation Internationalization controller
US11256867B2 (en) 2018-10-09 2022-02-22 Sdl Inc. Systems and methods of machine learning for digital assets and message creation
US11546403B2 (en) * 2018-12-26 2023-01-03 Wipro Limited Method and system for providing personalized content to a user
US11507966B2 (en) 2019-02-07 2022-11-22 Dell Products L.P. Multi-region document revision model with correction factor
CN110442879B (en) * 2019-04-30 2024-02-13 华为技术有限公司 Content translation method and terminal
US11397600B2 (en) * 2019-05-23 2022-07-26 HCL Technologies Italy S.p.A Dynamic catalog translation system
WO2020257970A1 (en) * 2019-06-24 2020-12-30 Citrix Systems, Inc. Previewing application user interface for multiple locales
US11227101B2 (en) * 2019-07-05 2022-01-18 Open Text Sa Ulc System and method for document translation in a format agnostic document viewer
US11164194B2 (en) * 2019-08-20 2021-11-02 Shopify Inc. Ecommerce storefront marketing channel synchronization management
US11037207B2 (en) 2019-08-20 2021-06-15 Shopify Inc. Channel synchronization engine with call control
CN110737431B (en) * 2019-09-18 2023-07-14 深圳市金证科技股份有限公司 Software development method, development platform, terminal device and storage medium
CN110704154B (en) * 2019-10-12 2023-05-12 杭州行至云起科技有限公司 Multi-language template release method and system
CN113284052A (en) 2020-02-19 2021-08-20 阿里巴巴集团控股有限公司 Image processing method and apparatus
US11494567B2 (en) * 2020-03-03 2022-11-08 Dell Products L.P. Content adaptation techniques for localization of content presentation
US11443122B2 (en) * 2020-03-03 2022-09-13 Dell Products L.P. Image analysis-based adaptation techniques for localization of content presentation
KR102415923B1 (en) * 2020-03-04 2022-07-04 김경철 Method for managing translation platform
US11580312B2 (en) 2020-03-16 2023-02-14 Servicenow, Inc. Machine translation of chat sessions
US11385916B2 (en) 2020-03-16 2022-07-12 Servicenow, Inc. Dynamic translation of graphical user interfaces
US11687732B2 (en) * 2020-04-06 2023-06-27 Open Text Holdings, Inc. Content management systems for providing automated translation of content items
US11449518B2 (en) * 2020-04-08 2022-09-20 Capital One Services, Llc Neural network-based document searching system
CN111729313A (en) * 2020-05-06 2020-10-02 完美世界(北京)软件科技发展有限公司 Language configuration method and device, storage medium and electronic device
US11392768B2 (en) 2020-05-07 2022-07-19 Servicenow, Inc. Hybrid language detection model
EP4430515A1 (en) * 2021-11-08 2024-09-18 AIRBNB, Inc. Selective pre-translation of web content
US20240265042A1 (en) * 2023-02-08 2024-08-08 Honeywell International Inc. Systems and methods for multi-language text indexing and search

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6526426B1 (en) * 1998-02-23 2003-02-25 David Lakritz Translation management system
US6623529B1 (en) * 1998-02-23 2003-09-23 David Lakritz Multilingual electronic document translation, management, and delivery system
US20040102956A1 (en) * 2002-11-22 2004-05-27 Levin Robert E. Language translation system and method
US7240279B1 (en) * 2002-06-19 2007-07-03 Microsoft Corporation XML patterns language
US7263656B2 (en) * 2001-07-16 2007-08-28 Canon Kabushiki Kaisha Method and device for scheduling, generating and processing a document comprising blocks of information
US7275208B2 (en) * 2002-02-21 2007-09-25 International Business Machines Corporation XML document processing for ascertaining match of a structure type definition
US7346652B2 (en) * 2002-05-13 2008-03-18 First Data Corporation Asynchronous data validation
US7383387B2 (en) * 2002-12-13 2008-06-03 Sap Ag Document transformation tool
US7441184B2 (en) * 2000-05-26 2008-10-21 Bull S.A. System and method for internationalizing the content of markup documents in a computer system
US7451232B1 (en) * 2000-05-25 2008-11-11 Microsoft Corporation Method for request and response direct data transfer and management of content manifests
US7613993B1 (en) * 2000-01-21 2009-11-03 International Business Machines Corporation Prerequisite checking in a system for creating compilations of content
US8196135B2 (en) * 2000-07-21 2012-06-05 Deltaxml, Limited Method of and software for recordal and validation of changes to markup language files

Family Cites Families (140)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS58101365A (en) * 1981-12-14 1983-06-16 Hitachi Ltd Text display calibration system in machine translation system
JP2831647B2 (en) 1988-03-31 1998-12-02 株式会社東芝 Machine translation system
ES2143509T3 (en) 1992-09-04 2000-05-16 Caterpillar Inc INTEGRATED EDITION AND TRANSLATION SYSTEM.
US5608622A (en) * 1992-09-11 1997-03-04 Lucent Technologies Inc. System for analyzing translations
AU5852896A (en) 1995-05-05 1996-11-21 Apple Computer, Inc. Method and apparatus for managing text objects
WO1996041281A1 (en) * 1995-06-07 1996-12-19 International Language Engineering Corporation Machine assisted translation tools
GB9513379D0 (en) 1995-06-30 1995-09-06 Jonhig Ltd Electronic purse system
US6073143A (en) 1995-10-20 2000-06-06 Sanyo Electric Co., Ltd. Document conversion system including data monitoring means that adds tag information to hyperlink information and translates a document when such tag information is included in a document retrieval request
US6993471B1 (en) 1995-11-13 2006-01-31 America Online, Inc. Integrated multilingual browser
US5835192A (en) 1995-12-21 1998-11-10 Johnson & Johnson Vision Products, Inc. Contact lenses and method of fitting contact lenses
US5974372A (en) 1996-02-12 1999-10-26 Dst Systems, Inc. Graphical user interface (GUI) language translator
US5855020A (en) * 1996-02-21 1998-12-29 Infoseek Corporation Web scan process
US5864852A (en) * 1996-04-26 1999-01-26 Netscape Communications Corporation Proxy server caching mechanism that provides a file directory structure and a mapping mechanism within the file directory structure
JP3121548B2 (en) 1996-10-15 2001-01-09 インターナショナル・ビジネス・マシーンズ・コーポレ−ション Machine translation method and apparatus
US5956740A (en) 1996-10-23 1999-09-21 Iti, Inc. Document searching system for multilingual documents
KR19980055170A (en) 1996-12-28 1998-09-25 김영귀 Vehicle Hazard Warning Device
KR19980055170U (en) 1996-12-31 1998-10-07 박병재 Input shaft oil ring tear prevention structure of automobile transmission
US6065026A (en) 1997-01-09 2000-05-16 Document.Com, Inc. Multi-user electronic document authoring system with prompted updating of shared language
US5898836A (en) * 1997-01-14 1999-04-27 Netmind Services, Inc. Change-detection tool indicating degree and location of change of internet documents by comparison of cyclic-redundancy-check(CRC) signatures
WO1998056068A1 (en) * 1997-06-02 1998-12-10 Ntt Mobile Communications Network Inc. Adaptive array antenna
EP0867815A3 (en) 1997-03-26 2000-05-31 Kabushiki Kaisha Toshiba Translation service providing method and translation service system
IL121071A0 (en) 1997-03-27 1997-11-20 El Mar Software Ltd Automatic conversion server
US6691279B2 (en) * 1997-03-31 2004-02-10 Sanyo Electric Co., Ltd Document preparation method and machine translation device
US5991710A (en) 1997-05-20 1999-11-23 International Business Machines Corporation Statistical translation system with features based on phrases or groups of words
US6112240A (en) 1997-09-03 2000-08-29 International Business Machines Corporation Web site client information tracker
US6161082A (en) 1997-11-18 2000-12-12 At&T Corp Network based language translation system
US6349275B1 (en) * 1997-11-24 2002-02-19 International Business Machines Corporation Multiple concurrent language support system for electronic catalogue using a concept based knowledge representation
US6122666A (en) 1998-02-23 2000-09-19 International Business Machines Corporation Method for collaborative transformation and caching of web objects in a proxy network
US6959318B1 (en) * 1998-03-06 2005-10-25 Intel Corporation Method of proxy-assisted predictive pre-fetching with transcoding
US6076108A (en) 1998-03-06 2000-06-13 I2 Technologies, Inc. System and method for maintaining a state for a user session using a web system having a global session server
US6163765A (en) * 1998-03-30 2000-12-19 Motorola, Inc. Subband normalization, transformation, and voiceness to recognize phonemes for text messaging in a radio communication system
US7020601B1 (en) 1998-05-04 2006-03-28 Trados Incorporated Method and apparatus for processing source information based on source placeable elements
US6345243B1 (en) * 1998-05-27 2002-02-05 Lionbridge Technologies, Inc. System, method, and product for dynamically propagating translations in a translation-memory system
US6154158A (en) * 1998-06-30 2000-11-28 Qualcomm Incorporated Digital-to-analog converter D.C. offset correction comparing converter input and output signals
US6526416B1 (en) * 1998-06-30 2003-02-25 Microsoft Corporation Compensating resource managers
WO2000005660A1 (en) 1998-07-23 2000-02-03 Logovista Corporation Modular language translation system
US7079584B2 (en) * 1998-08-10 2006-07-18 Kamilo Feher OFDM, CDMA, spread spectrum, TDMA, cross-correlated and filtered modulation
US6826593B1 (en) * 1998-09-01 2004-11-30 Lucent Technologies Inc. Computer implemented method and apparatus for fulfilling a request for information content with a user-selectable version of a file containing that information content
US6144310A (en) * 1999-01-26 2000-11-07 Morris; Gary Jay Environmental condition detector with audible alarm and voice identifier
US6349276B1 (en) 1998-10-29 2002-02-19 International Business Machines Corporation Multilingual information retrieval with a transfer corpus
US6347316B1 (en) * 1998-12-14 2002-02-12 International Business Machines Corporation National language proxy file save and incremental cache translation option for world wide web documents
KR20000039748A (en) 1998-12-15 2000-07-05 정선종 Apparatus for translating web documents written in multi-languages and method for translating service using the apparatus
US6263195B1 (en) * 1999-02-12 2001-07-17 Trw Inc. Wideband parallel processing digital tuner
US6338033B1 (en) 1999-04-20 2002-01-08 Alis Technologies, Inc. System and method for network-based teletranslation from one natural language to another
US6446036B1 (en) * 1999-04-20 2002-09-03 Alis Technologies, Inc. System and method for enhancing document translatability
US6286006B1 (en) 1999-05-07 2001-09-04 Alta Vista Company Method and apparatus for finding mirrored hosts by analyzing urls
US7607085B1 (en) 1999-05-11 2009-10-20 Microsoft Corporation Client side localizations on the world wide web
AUPQ141999A0 (en) * 1999-07-05 1999-07-29 Worldlingo.Com Pty Ltd Communication processing system
US6799020B1 (en) * 1999-07-20 2004-09-28 Qualcomm Incorporated Parallel amplifier architecture using digital phase control techniques
CN1176432C (en) * 1999-07-28 2004-11-17 国际商业机器公司 Method and system for providing national language inquiry service
US7110938B1 (en) 1999-09-17 2006-09-19 Trados, Inc. E-services translation portal system
CN1173282C (en) 1999-09-20 2004-10-27 国际商业机器公司 Method and system for dynamically adding new functions to a WWW page
US6393389B1 (en) * 1999-09-23 2002-05-21 Xerox Corporation Using ranked translation choices to obtain sequences indicating meaning of multi-token expressions
US6662233B1 (en) 1999-09-23 2003-12-09 Intel Corporation System dynamically translates translation information corresponding to a version of a content element having a bandwidth corresponding to bandwidth capability of a recipient
US7383320B1 (en) 1999-11-05 2008-06-03 Idom Technologies, Incorporated Method and apparatus for automatically updating website content
US7016977B1 (en) 1999-11-05 2006-03-21 International Business Machines Corporation Method and system for multilingual web server
JP2001167092A (en) 1999-12-13 2001-06-22 Nec Corp Translation server system
JP2001175683A (en) 1999-12-21 2001-06-29 Nec Corp Translation server system
AUPQ539700A0 (en) 2000-02-02 2000-02-24 Worldlingo.Com Pty Ltd Translation ordering system
US7216072B2 (en) 2000-02-29 2007-05-08 Fujitsu Limited Relay device, server device, terminal device, and translation server system utilizing these devices
AU2001245562A1 (en) 2000-03-10 2001-10-30 The One.Com System and method for providing interactive translation of information in a communication network
GB0006153D0 (en) * 2000-03-14 2000-05-03 Inpharmatica Ltd Database
US20020002452A1 (en) 2000-03-28 2002-01-03 Christy Samuel T. Network-based text composition, translation, and document searching
US20010029455A1 (en) * 2000-03-31 2001-10-11 Chin Jeffrey J. Method and apparatus for providing multilingual translation over a network
EP1139231A1 (en) 2000-03-31 2001-10-04 Fujitsu Limited Document processing apparatus and method
JP2001282732A (en) 2000-04-03 2001-10-12 Komatsu Ltd Method and system for providing service to distant user through inter-computer communication
AU6352201A (en) 2000-04-15 2001-10-30 Morrison, Ian Apparatus for generating electric and/or magnetic fields and detecting and using such fields
TW531901B (en) * 2000-04-27 2003-05-11 Semiconductor Energy Lab Light emitting device
US6604101B1 (en) 2000-06-28 2003-08-05 Qnaturally Systems, Inc. Method and system for translingual translation of query and search and retrieval of multilingual information on a computer network
US6865716B1 (en) 2000-05-05 2005-03-08 Aspect Communication Corporation Method and apparatus for dynamic localization of documents
US7199740B1 (en) * 2000-05-21 2007-04-03 Analog Devices, Inc. Method and apparatus for use in switched capacitor systems
WO2001093089A1 (en) 2000-05-26 2001-12-06 Theone.Com System and method for providing interactive translation of information in a communication network
JP3507768B2 (en) * 2000-05-29 2004-03-15 ホシデン株式会社 keyboard
EP1311971A1 (en) * 2000-06-23 2003-05-21 Advisortech Corporation Apparatus and method of providing multilingual content in an online environment
JP4011268B2 (en) 2000-07-05 2007-11-21 株式会社アイアイエス Multilingual translation system
US7389221B1 (en) 2000-07-17 2008-06-17 Globalenglish Corporation System and method for interactive translation
US6829311B1 (en) * 2000-09-19 2004-12-07 Kaben Research Inc. Complex valued delta sigma phase locked loop demodulator
US20020111787A1 (en) 2000-10-13 2002-08-15 Iko Knyphausen Client-driven workload environment
WO2002033607A1 (en) 2000-10-16 2002-04-25 Iis Inc. Method for offering multilingual information translated in many languages through a communication network
US20020065946A1 (en) * 2000-10-17 2002-05-30 Shankar Narayan Synchronized computing with internet widgets
US20020083068A1 (en) * 2000-10-30 2002-06-27 Quass Dallan W. Method and apparatus for filling out electronic forms
US6980953B1 (en) 2000-10-31 2005-12-27 International Business Machines Corp. Real-time remote transcription or translation service
US6859820B1 (en) 2000-11-01 2005-02-22 Microsoft Corporation System and method for providing language localization for server-based applications
US7139898B1 (en) * 2000-11-03 2006-11-21 Mips Technologies, Inc. Fetch and dispatch disassociation apparatus for multistreaming processors
US7418390B1 (en) 2000-11-20 2008-08-26 Yahoo! Inc. Multi-language system for online communications
US6665642B2 (en) * 2000-11-29 2003-12-16 Ibm Corporation Transcoding system and method for improved access by users with special needs
US8452850B2 (en) * 2000-12-14 2013-05-28 International Business Machines Corporation Method, apparatus and computer program product to crawl a web site
US20020091509A1 (en) 2001-01-02 2002-07-11 Yacov Zoarez Method and system for translating text
JP2002215621A (en) 2001-01-19 2002-08-02 Nec Corp Translation server, translation method and program
US6964014B1 (en) 2001-02-15 2005-11-08 Networks Associates Technology, Inc. Method and system for localizing Web pages
AUPR329501A0 (en) 2001-02-22 2001-03-22 Worldlingo, Inc Translation information segment
US20020123879A1 (en) 2001-03-01 2002-09-05 Donald Spector Translation system & method
AUPR360701A0 (en) * 2001-03-06 2001-04-05 Worldlingo, Inc Seamless translation system
WO2002073464A1 (en) 2001-03-09 2002-09-19 The One.Com System and method for providing efficient and accurate translation of information in a communication network
US20020133523A1 (en) * 2001-03-16 2002-09-19 Anthony Ambler Multilingual graphic user interface system and method
US20020165885A1 (en) 2001-05-03 2002-11-07 International Business Machines Corporation Method and system for verifying translation of localized messages for an internationalized application
US20030004703A1 (en) 2001-06-28 2003-01-02 Arvind Prabhakar Method and system for localizing a markup language document
WO2003004444A1 (en) 2001-07-02 2003-01-16 Exxonmobil Chemical Patents Inc. Inhibiting catalyst coke formation in the manufacture of an olefin
US6867693B1 (en) * 2001-07-25 2005-03-15 Lon B. Radin Spatial position determination system
US7793326B2 (en) * 2001-08-03 2010-09-07 Comcast Ip Holdings I, Llc Video and digital multimedia aggregator
EP1288793A1 (en) 2001-08-27 2003-03-05 Sony NetServices GmbH Translation text management system
JP2003063101A (en) * 2001-08-27 2003-03-05 Matsushita Graphic Communication Systems Inc Composite machine, terminal for connection therewith and network system comprising them
US6993473B2 (en) * 2001-08-31 2006-01-31 Equality Translation Services Productivity tool for language translators
US20030084401A1 (en) 2001-10-16 2003-05-01 Abel Todd J. Efficient web page localization
US20030115552A1 (en) 2001-11-27 2003-06-19 Jorg Jahnke Method and system for automatic creation of multilingual immutable image files
US20030105621A1 (en) * 2001-12-04 2003-06-05 Philippe Mercier Method for computer-assisted translation
KR100539929B1 (en) * 2001-12-15 2005-12-28 삼성전자주식회사 Digital frequency modulator
US20030120478A1 (en) * 2001-12-21 2003-06-26 Robert Palmquist Network-based translation system
US6869820B2 (en) 2002-01-30 2005-03-22 United Epitaxy Co., Ltd. High efficiency light emitting diode and method of making the same
US7412374B1 (en) * 2002-01-30 2008-08-12 Novell, Inc. Method to dynamically determine a user's language for a network
US20030154071A1 (en) * 2002-02-11 2003-08-14 Shreve Gregory M. Process for the document management and computer-assisted translation of documents utilizing document corpora constructed by intelligent agents
US6639534B2 (en) * 2002-02-14 2003-10-28 Silicon Laboratories, Inc. Digital-to-analog converter switching circuitry
JP3809863B2 (en) 2002-02-28 2006-08-16 インターナショナル・ビジネス・マシーンズ・コーポレーション Server
JP2003296223A (en) 2002-03-29 2003-10-17 Fuji Xerox Co Ltd Method and device, and program for providing web page information
US20030204573A1 (en) 2002-04-30 2003-10-30 Andre Beck Method of providing a web user with additional context-specific information
US7308399B2 (en) 2002-06-20 2007-12-11 Siebel Systems, Inc. Searching for and updating translations in a terminology database
US7110937B1 (en) 2002-06-20 2006-09-19 Siebel Systems, Inc. Translation leveraging
US7313511B2 (en) 2002-08-21 2007-12-25 California Institute Of Technology Method and apparatus for computer simulation of flight test beds
US7113960B2 (en) 2002-08-22 2006-09-26 International Business Machines Corporation Search on and search for functions in applications with varying data types
US20040049374A1 (en) 2002-09-05 2004-03-11 International Business Machines Corporation Translation aid for multilingual Web sites
US7634728B2 (en) 2002-12-28 2009-12-15 International Business Machines Corporation System and method for providing a runtime environment for active web based document resources
US7627817B2 (en) * 2003-02-21 2009-12-01 Motionpoint Corporation Analyzing web site for translation
US7536293B2 (en) 2003-02-24 2009-05-19 Microsoft Corporation Methods and systems for language translation
US6778117B1 (en) * 2003-02-28 2004-08-17 Silicon Laboratories, Inc. Local oscillator and mixer for a radio frequency receiver and related method
US7675996B2 (en) * 2003-02-28 2010-03-09 Johnson Richard A Television receiver suitable for multi-standard operation and method therefor
US7425995B2 (en) * 2003-02-28 2008-09-16 Silicon Laboratories, Inc. Tuner using a direct digital frequency synthesizer, television receiver using such a tuner, and method therefor
US7447493B2 (en) * 2003-02-28 2008-11-04 Silicon Laboratories, Inc. Tuner suitable for integration and method for tuning a radio frequency signal
US6950047B1 (en) * 2004-03-31 2005-09-27 Silicon Labs Cp, Inc. Method and apparatus for combining outputs of multiple DACs for increased bit resolution
EP1749096B1 (en) * 2004-05-28 2013-07-17 Mologen AG Method for the production of suitable dna constructs for specific inhibition of gene expression by rna interference
US20050283473A1 (en) 2004-06-17 2005-12-22 Armand Rousso Apparatus, method and system of artificial intelligence for data searching applications
US7183958B2 (en) * 2004-09-08 2007-02-27 M/A-Com, Eurotec B.V. Sub-ranging digital to analog converter for radiofrequency amplification
KR100639953B1 (en) * 2004-09-24 2006-11-01 주식회사 효성 Superconducting wire transposition method and superconducting transformer using the same
US20070033520A1 (en) 2005-08-08 2007-02-08 Kimzey Ann M System and method for web page localization
US7388387B2 (en) 2006-01-11 2008-06-17 Stratosphere Solutions, Inc. Method and apparatus for measurement of electrical resistance
US7612712B2 (en) 2006-04-25 2009-11-03 Rx Networks Inc. Distributed orbit modeling and propagation method for a predicted and real-time assisted GPS system
US20070255554A1 (en) 2006-04-26 2007-11-01 Lucent Technologies Inc. Language translation service for text message communications
MX2010010458A (en) * 2008-03-26 2010-10-20 Ngk Insulators Ltd Device and method for producing sealed honeycomb structure.
WO2020067139A1 (en) 2018-09-25 2020-04-02 積水化学工業株式会社 Mixed liquid agent, polyurethane composition, polyurethane foam, spray can, and mixing system
JP6611060B1 (en) 2018-10-02 2019-11-27 株式会社コナミアミューズメント Game system
CN109166550B (en) 2018-10-10 2020-12-29 惠科股份有限公司 Display device driving method and display device

Patent Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6526426B1 (en) * 1998-02-23 2003-02-25 David Lakritz Translation management system
US6623529B1 (en) * 1998-02-23 2003-09-23 David Lakritz Multilingual electronic document translation, management, and delivery system
US7613993B1 (en) * 2000-01-21 2009-11-03 International Business Machines Corporation Prerequisite checking in a system for creating compilations of content
US7451232B1 (en) * 2000-05-25 2008-11-11 Microsoft Corporation Method for request and response direct data transfer and management of content manifests
US7441184B2 (en) * 2000-05-26 2008-10-21 Bull S.A. System and method for internationalizing the content of markup documents in a computer system
US8196135B2 (en) * 2000-07-21 2012-06-05 Deltaxml, Limited Method of and software for recordal and validation of changes to markup language files
US7263656B2 (en) * 2001-07-16 2007-08-28 Canon Kabushiki Kaisha Method and device for scheduling, generating and processing a document comprising blocks of information
US7275208B2 (en) * 2002-02-21 2007-09-25 International Business Machines Corporation XML document processing for ascertaining match of a structure type definition
US7346652B2 (en) * 2002-05-13 2008-03-18 First Data Corporation Asynchronous data validation
US7240279B1 (en) * 2002-06-19 2007-07-03 Microsoft Corporation XML patterns language
US20040102956A1 (en) * 2002-11-22 2004-05-27 Levin Robert E. Language translation system and method
US7383387B2 (en) * 2002-12-13 2008-06-03 Sap Ag Document transformation tool

Also Published As

Publication number Publication date
US20170212888A1 (en) 2017-07-27
US20110209038A1 (en) 2011-08-25
US20140058719A1 (en) 2014-02-27
US8949223B2 (en) 2015-02-03
US20040167768A1 (en) 2004-08-26
US8065294B2 (en) 2011-11-22
US7584216B2 (en) 2009-09-01
US20180210877A1 (en) 2018-07-26
US20040237044A1 (en) 2004-11-25
US8566710B2 (en) 2013-10-22
US7580960B2 (en) 2009-08-25
US11308288B2 (en) 2022-04-19
US20090281790A1 (en) 2009-11-12
US20040168132A1 (en) 2004-08-26
US7627479B2 (en) 2009-12-01
US20160267076A1 (en) 2016-09-15
US9626360B2 (en) 2017-04-18
US20040167784A1 (en) 2004-08-26
US10621287B2 (en) 2020-04-14
US10409918B2 (en) 2019-09-10
US20100030550A1 (en) 2010-02-04
US20150106077A1 (en) 2015-04-16
US8433718B2 (en) 2013-04-30
US20100169764A1 (en) 2010-07-01
US7627817B2 (en) 2009-12-01
US20100174525A1 (en) 2010-07-08
US9910853B2 (en) 2018-03-06
US20130211817A1 (en) 2013-08-15
US7996417B2 (en) 2011-08-09
US9367540B2 (en) 2016-06-14
US20190340248A1 (en) 2019-11-07
US9652455B2 (en) 2017-05-16

Similar Documents

Publication Publication Date Title
US11308288B2 (en) Automation tool for web site content language translation
US10977329B2 (en) Dynamic language translation of web site content

Legal Events

Date Code Title Description
AS Assignment
Owner name: MOTIONPOINT CORPORATION, FLORIDA
Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:TRAVIESO, ENRIQUE;RUBENSTEIN, ADAM;FLEMING, WILLIAM;REEL/FRAME:041443/0453
Effective date: 20030221

STPP Information on status: patent application and granting procedure in general
Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION

STPP Information on status: patent application and granting procedure in general
Free format text: NON FINAL ACTION MAILED

STPP Information on status: patent application and granting procedure in general
Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general
Free format text: FINAL REJECTION MAILED

STCB Information on status: application discontinuation
Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION