JP6281329B2

JP6281329B2 - Image processing device

Info

Publication number: JP6281329B2
Application number: JP2014044338A
Authority: JP
Inventors: 良幸田中; 真樹近藤; 良平小澤; 智彦長谷川
Original assignee: Brother Industries Ltd
Current assignee: Brother Industries Ltd
Priority date: 2014-03-06
Filing date: 2014-03-06
Publication date: 2018-02-21
Anticipated expiration: 2034-03-06
Also published as: JP2015170981A

Description

This specification discloses an image processing device that uses original image data representing an original image, which includes a plurality of characters and a modifier, to generate target image data representing a target image in which the plurality of characters and the modifier are rearranged in a state different from that in the original image.

Patent Document 7 discloses an image forming apparatus that reads an original containing a document to generate image data, changes the format of the document (number of rows, number of columns, character size, and the like), and prints it. The image forming apparatus treats each character in the document as one rectangular region and changes the format of the document by rearranging the plurality of characters. In particular, when ruby is attached to a single Japanese kanji, the image forming apparatus integrates the kanji and the ruby and treats them as one character.

Patent Document 1: JP 2012-108750 A
Patent Document 2: JP 2012-230623 A
Patent Document 3: JP 2011-242987 A
Patent Document 4: JP 2010-183484 A
Patent Document 5: JP 2005-223824 A
Patent Document 6: JP H05-94511 A
Patent Document 7: JP 2000-137801 A
Patent Document 8: JP H11-25283 A
Patent Document 9: JP 2012-216038 A

Non-Patent Document 1: "Automatic line breaks to fit the screen size! Newly developed layout reconstruction technology 'GT-Layout' displays document files legibly on smartphones; service for the online storage 'Dropbox' launched", [online], May 30, 2012, FUJIFILM Corporation, [retrieved January 24, 2014], Internet <http://www.fujifilm.co.jp/corporate/news/articleffnr_0647.html>

In the technique of Patent Document 7, when rearranging a plurality of characters including a kanji modified by ruby, the image forming apparatus executes processing in units of one character, so the process of rearranging the plurality of characters may take a relatively long time. This specification provides a technique that can rearrange a plurality of characters quickly.

The image processing device disclosed in this specification includes an acquisition unit, a determination unit, a specifying unit, and a target image data generation unit. The acquisition unit acquires original image data representing an original image. The original image includes M lines of character strings (M is an integer of 1 or more) and a modifier for modifying a modified character among the plurality of characters constituting the M lines of character strings. Each of the M lines of character strings consists of two or more characters arranged along a first direction. When M is an integer of 2 or more, the M lines of character strings are arranged along a second direction orthogonal to the first direction. The modifier is located near the modified character, on a first side or a second side of the modified character in the second direction. The determination unit determines a plurality of band-shaped regions from the original image. The plurality of band-shaped regions include M main band-shaped regions containing the M lines of character strings and a sub band-shaped region containing the modifier. The specifying unit specifies the sub band-shaped region from among the plurality of band-shaped regions based on the lengths of the plurality of band-shaped regions along the second direction. The target image data generation unit treats the modifier contained in the sub band-shaped region and the one line of character string contained in a neighboring main band-shaped region, which is located near the sub band-shaped region on the first side or the second side of the sub band-shaped region in the second direction, as one line of modified character string, and generates target image data representing a target image in which the plurality of characters constituting the M lines of character strings and the modifier are rearranged in a state different from that in the original image.

With the above configuration, the image processing device treats the modifier contained in the sub band-shaped region and the one line of character string contained in the neighboring main band-shaped region, which contains two or more characters, as one line of modified character string, so it does not need to execute processing in units of one character. As a result, the image processing device can quickly generate target image data representing a target image in which the plurality of characters and the modifier are rearranged in a state different from that in the original image.

A control method and a computer program for realizing the above image processing device, and a computer-readable recording medium storing the computer program, are also novel and useful.

FIG. 1 shows the configuration of the communication system.
FIG. 2 shows a flowchart of the processing of the image processing server.
FIG. 3 shows a flowchart of the character string analysis process.
FIG. 4 shows a flowchart of the band-shaped region determination process.
FIG. 5 shows a flowchart of the modifier analysis process.
FIG. 6 shows a flowchart of the combining process.
FIG. 7 shows a flowchart of the division candidate position determination process.
FIG. 8 shows a specific example of the division candidate position determination process.
FIG. 9 shows a flowchart of the rearrangement process.
FIG. 10 shows a flowchart of the line count determination process.
FIG. 11 shows a specific example of the rearrangement process and the enlargement process.
FIG. 12 is an explanatory diagram for the modifier analysis process of the second embodiment.
FIG. 13 is an explanatory diagram for the modifier analysis process of the third embodiment.
FIG. 14 shows a flowchart of the modifier analysis process of the fourth embodiment.
FIG. 15 shows a flowchart of the modifier analysis process of the fifth embodiment.
FIG. 16 shows a flowchart of the modifier analysis process of the sixth embodiment.

(First embodiment)
(Configuration of communication system 2)
As shown in FIG. 1, the communication system 2 includes a multi-function device 10 and an image processing server 50. The multi-function device 10 and the image processing server 50 can communicate with each other via the Internet 4. The multi-function device 10 is a peripheral device (that is, a peripheral device of a PC (Personal Computer) or the like, not shown) capable of executing multiple functions including a print function, a scan function, a copy function, and a FAX function. The image processing server 50 is a server provided on the Internet 4 by the vendor of the multi-function device 10.

(Outline of each process executed by the multi-function device 10)
The copy functions that the multi-function device 10 can execute are classified into a monochrome copy function and a color copy function; this embodiment focuses on the color copy function. The color copy function is classified into a normal color copy function and a character-enlargement color copy function. Whichever color copy function the user instructs it to execute, the multi-function device 10 first color-scans a sheet bearing the image to be scanned (hereinafter the "scan target sheet") to generate scan image data SID. The scan image data SID is RGB bitmap data with multiple gradations (for example, 256 gradations).

The scan image SI represented by the scan image data SID (that is, the image rendered on the scan target sheet) has a white background and includes a text object TOB and a photo object POB. The text object TOB includes three lines of character strings composed of a plurality of black characters "A to M". Three of the characters "A to M", namely "FGH", are modified by a black modifier line. The modifier line is a single line (that is, an underline) extending in the horizontal direction below the three characters "FGH". The characters and the modifier line may be in a color other than black (for example, red). The photo object POB contains no characters and includes a photograph composed of a plurality of colors.

In the figures of this embodiment, for convenience, the character strings constituting the text object TOB are represented by the letters "A to M" arranged in regular order, but in practice each character string forms a sentence. Within each character string (that is, one line of character string), the sentence proceeds from left to right in the horizontal direction of the scan image SI. Across the three lines of character strings "A to M", the sentence proceeds from top to bottom in the vertical direction of the scan image SI. In every image discussed below (for example, the processed image PI described later), the direction in which the characters constituting one line of character string are arranged is called the "horizontal direction", and the direction orthogonal to it is called the "vertical direction". Because the sentence proceeds from left to right, the left end and the right end in the horizontal direction are called the "leading end" and the "trailing end", respectively.

When the user instructs execution of the normal color copy function, the multi-function device 10 uses the scan image data SID to print an image on a sheet (hereinafter the "print target sheet") according to the copy magnification set by the user. For example, when the copy magnification is 1x, the multi-function device 10 prints on the print target sheet an image of the same size as the image rendered on the scan target sheet. When the copy magnification indicates enlargement, the multi-function device 10 prints on the print target sheet an image larger than the image rendered on the scan target sheet. In this case, for example, the image rendered on an A4-size scan target sheet is enlarged and printed on an A3-size print target sheet. As a result, an image in which both objects TOB and POB are enlarged is printed on the print target sheet.

On the other hand, when the user instructs execution of the character-enlargement color copy function, the multi-function device 10 transmits the scan image data SID to the image processing server 50 via the Internet 4. The multi-function device 10 then receives processed image data PID from the image processing server 50 via the Internet 4 and prints the processed image PI represented by the processed image data PID on the print target sheet. In particular, the multi-function device 10 prints the processed image PI on a print target sheet of the same size as the scan target sheet (for example, A4).

In the processed image PI, compared with the scan image SI, the text object TOB is enlarged while the photo object POB is not. Therefore, even if the characters in the scan image SI are small, the characters in the processed image PI are larger, so the user can easily recognize each character in the processed image PI. The number of characters in the first line "A to F" of the character strings "A to M" in the processed image PI (that is, 6) differs from the number of characters in the first line "A to E" of the character strings "A to M" in the scan image SI (that is, 5). As in the scan image SI, the three characters "FGH" in the processed image PI are modified by the modifier line.

(Configuration of the image processing server 50)
The image processing server 50 executes image processing on the scan image data SID received from the multi-function device 10 to generate processed image data PID, and transmits the processed image data PID to the multi-function device 10. The image processing server 50 includes a network interface 52 and a control unit 60. The network interface 52 is connected to the Internet 4. The control unit 60 includes a CPU 62 and a memory 64. The CPU 62 is a processor that executes various processes (that is, the processes of FIG. 2 and so on) according to a program 66 stored in the memory 64.

(Each process executed by the image processing server 50; FIG. 2)
Next, the processes executed by the CPU 62 of the image processing server 50 are described with reference to FIG. 2. The CPU 62 starts the process of FIG. 2 when it receives the scan image data SID from the multi-function device 10 via the Internet 4.

In S100, the CPU 62 executes a character string analysis process (see FIG. 3, described later) to determine a text object region TOA containing the three lines of character strings "A to M" in the scan image SI. The CPU 62 then determines three band-shaped regions LA within the text object region TOA.

In S200, the CPU 62 executes a combining process (see FIG. 6, described later) to generate combined image data representing a combined image CI. The combined image CI contains one line of combined character string "A to M", in which the three lines of character strings contained in the three band-shaped regions LA are joined (that is, concatenated) in a straight line along the horizontal direction.

In S300, the CPU 62 executes a target region determination process to determine a target region TA within the scan image SI. Specifically, the CPU 62 first determines a space region whose upper-left vertex coincides with the upper-left vertex of the text object region TOA. The space region is larger in size (that is, area) than the text object region TOA and does not overlap any other object region (for example, the object region containing the photo object POB). The CPU 62 then determines, within the space region, a target region TA whose aspect ratio equals that of the space region. The size (that is, area) of the target region TA is the largest size that does not exceed α times the size (that is, area) of the text object region TOA. α is a value greater than 1, for example 1.4. The target region TA is positioned so that its upper-left vertex coincides with the upper-left vertex of the text object region TOA. The aspect ratio of the target region TA usually differs from that of the text object region TOA. The target region TA in the scan image SI coincides with the target region TA in the processed image PI (see the processed image PI of S500). Therefore, the process of S300 is equivalent to determining the target region TA in the processed image PI. The target region TA in the processed image PI is the region in which the enlarged character strings "A to M" are to be placed.
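The sizing rule of S300 can be sketched as follows. This is only an illustration under stated assumptions: regions are hypothetical (left, top, right, bottom) tuples with exclusive right and bottom edges, the function name is invented, and the space region is taken as already known. Rounding down to whole pixels means the aspect ratio is only approximately preserved.

```python
import math

def target_region(text_region, space_region, alpha=1.4):
    """Sketch of S300: fit a target region with the space region's aspect ratio.

    Regions are (left, top, right, bottom), right/bottom exclusive (an assumption).
    """
    t_left, t_top, t_right, t_bottom = text_region
    s_left, s_top, s_right, s_bottom = space_region
    text_area = (t_right - t_left) * (t_bottom - t_top)
    space_w, space_h = s_right - s_left, s_bottom - s_top
    # Scale the space region so its area is at most alpha times the text area;
    # the cap at 1.0 keeps the target inside the space region.
    scale = min(1.0, math.sqrt(alpha * text_area / (space_w * space_h)))
    width = math.floor(space_w * scale)
    height = math.floor(space_h * scale)
    # The upper-left vertex coincides with that of the text object region.
    return (t_left, t_top, t_left + width, t_top + height)

# Text region of 400 x 200 pixels inside a space region of 600 x 300 pixels.
target = target_region((100, 100, 500, 300), (100, 100, 700, 400))
```

Because both sides are rounded down, the resulting area never exceeds α times the area of the text object region.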

In S400, the CPU 62 executes a rearrangement process (see FIG. 9, described later). First, the CPU 62 determines a rearrangement region RA. Then, using the combined image data representing the combined image CI, the CPU 62 rearranges the plurality of characters "A to M" within the rearrangement region RA, thereby generating rearranged image data representing a rearranged image RI.

In S500, the CPU 62 enlarges the rearranged image data representing the rearranged image RI to generate enlarged image data. The CPU 62 then uses the enlarged image data to generate processed image data PID representing the processed image PI. Although each character is rendered enlarged in the processed image PI, the processed image data PID has the same number of pixels as the scan image data SID.

In S600, the CPU 62 transmits the processed image data PID to the multi-function device 10 via the Internet 4. As a result, the multi-function device 10 prints the processed image PI represented by the processed image data PID on the print target sheet.

(Character string analysis process; FIG. 3)
Next, the character string analysis process executed in S100 of FIG. 2 is described with reference to FIG. 3. In S110, the CPU 62 executes a binarization process on the scan image data SID to generate binary data BD (only part of which is shown in FIG. 3) having the same number of pixels as the scan image data SID. The CPU 62 first uses the scan image data SID to determine the background color of the scan image SI (white in this embodiment). Specifically, the CPU 62 generates a histogram showing the frequency distribution of the pixel values of the pixels in the scan image data SID. The CPU 62 then uses the histogram to identify the pixel value with the highest frequency (hereinafter the "most frequent pixel value"), thereby determining the background color. Next, for each pixel in the scan image data SID, the CPU 62 assigns "0" as the pixel value of the pixel at the corresponding position in the binary data BD if the pixel value matches the most frequent pixel value, and assigns "1" if it does not. As a result, in the binary data BD, the pixels representing the characters of the text object TOB (for example, "A" and "B") and the modifier line have the pixel value "1", the pixels representing the photo object POB have the pixel value "1", and all other pixels (that is, the pixels representing the background) have the pixel value "0". Hereinafter, pixels with the pixel value "1" and pixels with the pixel value "0" in the binary data BD are called "ON pixels" and "OFF pixels", respectively.
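The binarization of S110 can be illustrated with a short sketch. It assumes the image is held as a list of rows of (R, G, B) tuples and uses an invented function name; the embodiment itself does not prescribe a data layout.

```python
from collections import Counter

def binarize(image):
    """Sketch of S110. image: list of rows, each a list of (R, G, B) tuples."""
    # Histogram of pixel values; the most frequent value is taken as the background.
    histogram = Counter(pixel for row in image for pixel in row)
    background, _ = histogram.most_common(1)[0]
    # 0 (OFF pixel) where the value equals the background, 1 (ON pixel) otherwise.
    return [[0 if pixel == background else 1 for pixel in row] for row in image]

WHITE, BLACK = (255, 255, 255), (0, 0, 0)
scan = [
    [WHITE, WHITE, WHITE, WHITE],
    [WHITE, BLACK, BLACK, WHITE],
    [WHITE, WHITE, BLACK, WHITE],
]
binary = binarize(scan)
```

In this toy image the three black pixels become ON pixels and the nine white pixels become OFF pixels.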

In S120, the CPU 62 executes a labeling process on the binary data BD generated in S110 to generate label data LD (only part of which is shown in FIG. 3) having the same number of pixels as the binary data BD. Specifically, the CPU 62 divides the ON pixels in the binary data BD into two or more ON pixel groups and assigns a different pixel value (for example, "1", "2", and so on) to each of the groups. One ON pixel group consists of two or more mutually adjacent ON pixels. That is, when the eight pixels adjacent to the ON pixel being labeled include one or more ON pixels, the CPU 62 places the ON pixel being labeled and those adjacent ON pixels in the same ON pixel group (that is, groups them). The CPU 62 determines the two or more ON pixel groups by grouping the ON pixels one after another while changing the ON pixel being labeled. For example, in the label data LD of FIG. 3, the pixel value "1" is assigned to the ON pixels representing the character "A" (that is, one ON pixel group), and the pixel value "2" is assigned to the ON pixels representing the character "B" (that is, another ON pixel group).
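The labeling of S120 amounts to 8-connected component labeling. The sketch below uses a breadth-first flood fill, which is one common way to do this and is not necessarily how the embodiment implements it. The function name and the list-of-rows layout are assumptions, and an isolated ON pixel simply receives its own label here.

```python
from collections import deque

def label_on_pixels(binary):
    """Sketch of S120: give each 8-connected group of ON pixels its own label."""
    height, width = len(binary), len(binary[0])
    labels = [[0] * width for _ in range(height)]
    next_label = 0
    for y in range(height):
        for x in range(width):
            if binary[y][x] != 1 or labels[y][x] != 0:
                continue  # OFF pixel, or ON pixel that is already grouped
            next_label += 1
            labels[y][x] = next_label
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                # Visit the eight neighbours of the current ON pixel.
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and binary[ny][nx] == 1 and labels[ny][nx] == 0):
                            labels[ny][nx] = next_label
                            queue.append((ny, nx))
    return labels, next_label

binary = [
    [1, 1, 0, 0, 1],
    [0, 1, 0, 0, 1],
    [0, 0, 0, 1, 0],
]
labels, count = label_on_pixels(binary)
```

The pixel in the bottom row joins the second group because diagonal neighbours count as adjacent.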

In S130, the CPU 62 uses the label data LD generated in S120 to determine a unit region corresponding to each of the ON pixel groups. Each unit region is a rectangular region circumscribing the corresponding ON pixel group. For example, when using the label data LD of FIG. 3, the CPU 62 determines a unit region RE1 circumscribing the ON pixel group assigned the pixel value "1" (that is, the unit region corresponding to the character "A") and a unit region RE2 circumscribing the ON pixel group assigned the pixel value "2" (that is, the unit region corresponding to the character "B"). More specifically, the CPU 62 determines 15 unit regions from the scan image SI: 13 unit regions corresponding to the 13 characters "A" to "M", one unit region corresponding to the modifier line, and one unit region corresponding to the photo object POB. A unit region is determined by storing in the memory 64 the positions of the pixels forming its vertices. In the following, however, descriptions of "determining a region (or position)" omit the explanation that pixel positions are stored in the memory 64.
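Determining the unit regions of S130 is a bounding-box computation over the label data. A minimal sketch follows, assuming the label data is a list of rows in which 0 means "no group" and assuming inclusive pixel coordinates; the function name is illustrative.

```python
def unit_regions(labels):
    """Sketch of S130: return {label: (left, top, right, bottom)}, inclusive."""
    boxes = {}
    for y, row in enumerate(labels):
        for x, label in enumerate(row):
            if label == 0:
                continue  # background pixel, belongs to no ON pixel group
            if label not in boxes:
                boxes[label] = (x, y, x, y)
            else:
                left, top, right, bottom = boxes[label]
                # Grow the rectangle so that it still circumscribes the group.
                boxes[label] = (min(left, x), min(top, y),
                                max(right, x), max(bottom, y))
    return boxes

label_data = [
    [1, 1, 0, 0, 2],
    [0, 1, 0, 0, 2],
    [0, 0, 0, 2, 0],
]
regions = unit_regions(label_data)
```

Storing only the corner coordinates mirrors the description that a unit region is determined by recording the positions of its vertices.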

In S140, the CPU 62 uses the unit regions determined in S130 to determine the object regions in the scan image SI. Specifically, the CPU 62 divides the 15 unit regions into a plurality of unit region groups and determines an object region corresponding to each unit region group. One unit region group consists of one or more unit regions located near one another. When the distance between two unit regions is less than a predetermined distance, the CPU 62 places the two unit regions in the same unit region group. The predetermined distance is set in advance according to the resolution of the scan image data SID. For example, in this embodiment the scan image data SID has a resolution of 300 dpi, and the predetermined distance corresponding to 300 dpi is 10 pixels. In the label data LD of FIG. 3, the distance between the unit region RE1 corresponding to the character "A" and the unit region RE2 corresponding to the character "B" is 3 pixels. The CPU 62 therefore places the unit regions RE1 and RE2 in the same unit region group. In this way, the CPU 62 can group characters located near one another. More specifically, for the scan image SI, the CPU 62 determines two unit region groups: one containing the 14 unit regions corresponding to the 13 characters "A" to "M" and the modifier line in the text object TOB, and one containing the single unit region corresponding to the photo object POB. Then, for each of the two unit region groups, the CPU 62 determines the rectangular region circumscribing the group as an object region. By determining the object regions in the label data LD in this way, the CPU 62 determines two object regions in the scan image SI: an object region TOA containing the 13 characters "A" to "M" and the modifier line of the text object TOB, and an object region POA containing the photo object POB.
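The grouping of S140 can be sketched with a small union-find structure. The text does not define how the distance between two unit regions is measured, so the sketch assumes it is the larger of the horizontal and vertical gaps between the rectangles. Boxes are hypothetical (left, top, right, bottom) tuples with exclusive right and bottom edges, and all names are illustrative.

```python
def gap(a, b):
    """Assumed distance: larger of the horizontal and vertical gaps, in pixels."""
    dx = max(a[0] - b[2], b[0] - a[2], 0)
    dy = max(a[1] - b[3], b[1] - a[3], 0)
    return max(dx, dy)

def object_regions(boxes, threshold=10):
    """Sketch of S140: merge nearby unit regions and circumscribe each group."""
    parent = list(range(len(boxes)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Unit regions closer than the threshold go into the same group.
    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            if gap(boxes[i], boxes[j]) < threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i, box in enumerate(boxes):
        groups.setdefault(find(i), []).append(box)
    # Each object region is the rectangle circumscribing one group.
    return sorted(
        (min(b[0] for b in g), min(b[1] for b in g),
         max(b[2] for b in g), max(b[3] for b in g))
        for g in groups.values()
    )

char_a = (0, 0, 20, 30)       # unit region of "A"
char_b = (23, 0, 43, 30)      # unit region of "B", 3 pixels to the right
photo = (0, 100, 200, 250)    # unit region of a photo, 70 pixels below
result = object_regions([char_a, char_b, photo])
```

With the 10-pixel threshold the two character boxes merge into one object region, while the photo remains a separate one.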

In S150, the CPU 62 determines the type of each of the two object areas TOA and POA determined in S140. Specifically, the CPU 62 determines whether each object area TOA, POA is a text object area including characters (hereinafter simply a “text area”). The CPU 62 first generates a histogram showing the frequency distribution of the pixel values of the pixels constituting the partial image data that represents the object area TOA within the scan image data SID. Using the histogram, the CPU 62 then calculates the number of pixel values whose frequency is higher than zero (that is, the number of colors used in the object area TOA). If the calculated number is less than a predetermined number (for example, “10”), the CPU 62 determines that the object area TOA is a text area; if the calculated number is equal to or greater than the predetermined number, it determines that the object area TOA is not a text area. The object area TOA includes the black characters “A” to “M”, the black modification line, and a white background. Therefore, in the histogram corresponding to the object area TOA, usually only two pixel values, the one indicating black and the one indicating white, have a frequency higher than zero. The CPU 62 accordingly determines that the object area TOA is a text area. By contrast, a photo object such as POB usually uses ten or more colors. Therefore, in the histogram corresponding to the object area POA, the number of pixel values having a frequency higher than zero is usually equal to or greater than the predetermined number. The CPU 62 accordingly determines that the object area POA is not a text area but a photo object area.
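The color-count test of S150 can be illustrated with a minimal sketch. The function name `is_text_area`, the representation of pixels as RGB tuples, and the sample data are assumptions made for illustration only; the specification states just the histogram, the count of pixel values with non-zero frequency, and the example threshold of 10.

```python
from collections import Counter

def is_text_area(pixels, max_colors=10):
    """Return True when fewer than max_colors distinct pixel values occur.

    pixels: iterable of pixel values (here RGB tuples) of one object area.
    max_colors: the predetermined number (10 in the example of S150).
    """
    histogram = Counter(pixels)      # frequency of each pixel value
    used_colors = len(histogram)     # values whose frequency is above zero
    return used_colors < max_colors

# Hypothetical data: a black-on-white text area and a many-colored photo area.
text_like = [(0, 0, 0)] * 40 + [(255, 255, 255)] * 160
photo_like = [(i, i, 255 - i) for i in range(64)]

print(is_text_area(text_like), is_text_area(photo_like))
```

A real scan contains noise and anti-aliased edges, so an implementation would likely quantize the pixel values before counting; that step is outside what the text describes.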

In S160, the CPU 62 executes the band-shaped area determination process (see FIG. 4, described later) on the text area TOA determined in S150 to determine each band-shaped area in the text area TOA. The CPU 62 does not, however, execute the band-shaped area determination process on the photo object area POA.

In S180, the CPU 62 executes the modifier analysis process (see FIG. 5, described later) on each band-shaped area determined in S160 to determine whether each band-shaped area is a character string band-shaped area including a character string or a modifier band-shaped area including a modification line. When S180 ends, the process of FIG. 3 ends.

(Band-shaped area determination process; FIG. 4)
Next, the band-shaped area determination process executed in S160 of FIG. 3 is described with reference to FIG. 4. The description below uses the text area TOA in the scanned image SI as an example. If the scanned image SI includes a plurality of text objects, the process of FIG. 4 is executed for each text object (that is, for each text area).

In S162, the CPU 62 generates a projection histogram corresponding to the text area TOA. The projection histogram shows the frequency distribution of ON pixels (that is, pixels indicating “1”) obtained when, among the pixels constituting the binary data BD (see S110), the pixels representing the text area TOA are projected in the horizontal direction. In other words, the projection histogram shows the frequency distribution of the pixels that constitute the character strings and the modification line (that is, pixels representing black) obtained when, among the pixels constituting the scan image data SID, the pixels representing the text area TOA are projected in the horizontal direction. In the projection histogram, the character strings and the modification line appear as ranges in which the frequency is higher than zero (hereinafter “high-frequency ranges”). The line spacing between two lines of character strings (for example, between “A to E” and “F to J”) and the line spacing between a character string and the modification line (for example, between “F to J” and the modification line) appear as ranges in which the frequency is zero.

In S164, the CPU 62 uses the projection histogram generated in S162 to determine one or more band-shaped areas corresponding to the one or more high-frequency ranges. The vertical length (that is, the number of pixels in the vertical direction) of a band-shaped area equals the vertical length of the corresponding high-frequency range in the projection histogram. The horizontal length (that is, the number of pixels in the horizontal direction) of a band-shaped area equals the horizontal length of the text area TOA. As a result, the CPU 62 determines four band-shaped areas LA11 to LA14 in the text area TOA: the band-shaped area LA11 including the character string “A to E”, the band-shaped area LA12 including the character string “F to J”, the band-shaped area LA13 including the modification line, and the band-shaped area LA14 including the character string “K to M”.
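S162 and S164 can be sketched as follows. The sketch assumes the binary data is held as a list of rows of 0/1 values; the function name `find_band_areas` and the sample grid are hypothetical and only illustrate how runs of non-zero row counts become band-shaped areas.

```python
def find_band_areas(binary):
    """Split a binary image into band-shaped areas.

    binary: list of rows, each a list of 0/1 values (1 = ON pixel).
    Returns a list of (top, bottom) row indices, bottom inclusive.
    """
    # Projection histogram: number of ON pixels in each row.
    counts = [sum(row) for row in binary]
    bands, start = [], None
    for y, n in enumerate(counts):
        if n > 0 and start is None:
            start = y                      # a high-frequency range begins
        elif n == 0 and start is not None:
            bands.append((start, y - 1))   # the range ended on the previous row
            start = None
    if start is not None:                  # range touching the last row
        bands.append((start, len(counts) - 1))
    return bands

# Hypothetical 6-row area: a two-row line of text, a gap, and a one-row line.
binary = [
    [0, 0, 0, 0],
    [1, 0, 1, 0],
    [1, 1, 1, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
bands = find_band_areas(binary)
print(bands)
```

Each returned pair gives only the vertical extent; as stated in S164, the horizontal extent of every band-shaped area is that of the text area itself.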

Subsequently, the CPU 62 executes the processes of S166 to S174 to determine a reference position for each of the band-shaped areas LA11 to LA14. The reference position serves as the reference for combining the character strings included in the band-shaped areas in the combining process of S200 of FIG. 2.

In S166, the CPU 62 determines one of the four band-shaped areas LA11 to LA14 determined in S164 as the processing target (hereinafter the “target band-shaped area”). The description below takes the case where the band-shaped area LA11 is determined as the target band-shaped area as an example.

In S168, the CPU 62 sets an evaluation range of three pixels in the vertical direction within the entire vertical range AR of the target band-shaped area LA11. In the first execution of S168 for the target band-shaped area LA11, the CPU 62 sets the first evaluation range so that the uppermost of the three pixels lies at the middle position of the entire vertical range AR. In the second and subsequent executions of S168 for the target band-shaped area LA11, the CPU 62 sets the current evaluation range so that it is shifted down by one pixel from the previous evaluation range. In a modification, the evaluation range need not span three pixels in the vertical direction; it may span one or two pixels, or four or more pixels.

In S170, the CPU 62 calculates a total lower side length for the current evaluation range. When the lower sides XA to XE of one or more of the five unit areas (determined in S130 of FIG. 3) corresponding to the five characters “A” to “E” in the target band-shaped area LA11 lie within the current evaluation range, the total lower side length is the sum of the lengths of the lower sides of those unit areas. In the example of FIG. 4, no lower side of any unit area lies in the first or second evaluation range, so the CPU 62 determines “0” as the total lower side length. In the p-th evaluation range, all five lower sides XA to XE are present, so the CPU 62 calculates a value of “1” or more as the total lower side length, which is the sum of the lengths of the five lower sides XA to XE.

In S172, the CPU 62 determines whether the processes of S168 and S170 have been completed for all evaluation ranges of the target band-shaped area LA11. Specifically, if the lower end of the entire vertical range AR of the target band-shaped area LA11 coincides with the lower end of the previous evaluation range (for example, the p-th evaluation range), the CPU 62 determines that the processing has been completed for all evaluation ranges (YES in S172) and proceeds to S174. If the CPU 62 determines that the processing has not been completed for all evaluation ranges (NO in S172), it returns to S168 and sets a new evaluation range.

In S174, the CPU 62 determines the reference position of the target band-shaped area LA11 based on the total lower side lengths calculated for the evaluation ranges. Specifically, the CPU 62 first selects, from among the evaluation ranges, the one for which the largest total lower side length was calculated (for example, the p-th evaluation range). If two or more evaluation ranges share the largest total lower side length, the CPU 62 selects the one that was set first. The CPU 62 then determines the vertical middle position of the selected evaluation range as the reference position. In the example of FIG. 4, a position near the five lower sides XA to XE in the vertical direction of the target band-shaped area LA11, that is, a position near the lowermost end of the target band-shaped area LA11, is determined as the reference position.

In S176, the CPU 62 determines whether the processes of S166 to S174 have been completed for all of the band-shaped areas LA11 to LA14. If the CPU 62 determines that the processing has not been completed (NO in S176), it determines an unprocessed band-shaped area (for example, LA12) as the processing target in S166 and executes S168 and the subsequent processes again. As a result, four reference positions corresponding to the four band-shaped areas LA11 to LA14 are determined. If the CPU 62 determines that the processing has been completed (YES in S176), it ends the process of FIG. 4.

As described above, this embodiment focuses on the length of the lower side of each character's unit area in order to determine the reference position. The total lower side length therefore does not usually reach its maximum in the relatively upper part of the entire vertical range AR of the target band-shaped area LA11. For this reason, in S168 the first evaluation range is set at the middle position of the entire vertical range AR of the target band-shaped area LA11, and the evaluation range is then moved downward. This reduces the number of evaluation ranges set in S168, so the reference position can be determined quickly.

In another example shown in FIG. 4, the target band-shaped area determined as the processing target in S166 includes six lowercase letters “d” to “i”. One unit area is determined for each character other than “i”, whereas two unit areas are determined for “i”. The lower side Xg of the character “g” lies below the lower sides Xd to Xf, Xh, and Xi2 of the other five characters. In this example, a total lower side length of 1 or more is calculated for each of the evaluation range including the lower side Xi1, the evaluation range including the lower sides Xd to Xf, Xh, and Xi2, and the evaluation range including the lower side Xg. Because the total lower side length calculated for the evaluation range including the lower sides Xd to Xf, Xh, and Xi2 is the largest, the middle position of that evaluation range is determined as the reference position. In this way, in this embodiment the reference position is determined based on the evaluation range with the largest total lower side length, and two or more lines of character strings are combined based on that reference position (see the combined image CI in S200 of FIG. 2). This keeps the arrangement of the characters constituting the character string in the processed image PI (see S500 of FIG. 2) from appearing unnatural to the user.
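The loop of S168 to S174 can be sketched as below. The function name `reference_position`, the (bottom_y, width) representation of each unit area's lower side, the fallback for very short bands, and all pixel values in the example are assumptions for illustration; they are not taken from FIG. 4.

```python
def reference_position(band_height, lower_sides, window=3):
    """Return the reference row of one band-shaped area.

    band_height: vertical length of the band in pixels.
    lower_sides: list of (bottom_y, width) per unit area, with bottom_y
                 counted from the top of the band (0-based).
    window: vertical size of the evaluation range (3 pixels in S168).
    """
    best_top, best_total = None, -1
    top = band_height // 2                 # first range starts at the middle
    while top + window <= band_height:     # stop when the range reaches the bottom
        total = sum(w for y, w in lower_sides if top <= y < top + window)
        if total > best_total:             # strict ">" keeps the earliest range on ties
            best_top, best_total = top, total
        top += 1                           # shift down by one pixel
    if best_top is None:                   # band too short for any range (assumed fallback)
        return band_height - 1
    return best_top + window // 2          # middle row of the selected range

# Hypothetical layout for "d" to "i": five lower sides on row 15,
# the descender of "g" on row 19, and the dot of "i" on row 11.
sides = [(15, 8), (15, 8), (15, 6), (15, 8), (15, 3), (19, 8), (11, 3)]
print(reference_position(20, sides))
```

With these numbers the ranges starting at rows 13, 14, and 15 all reach the largest total, and the first of them is selected, which mirrors the tie rule of S174.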

(Modifier analysis process; FIG. 5)
Next, the modifier analysis process executed in S180 of FIG. 3 is described with reference to FIG. 5. The description below uses the text area TOA in the scanned image SI as an example. If the scanned image SI includes a plurality of text objects, the process of FIG. 5 is executed for each text object (that is, for each text area).

In S181, the CPU 62 identifies the four lengths h11 to h14 of the four band-shaped areas LA11 to LA14 of the text area TOA along the vertical direction, and then calculates the average value ha of the four lengths h11 to h14.

In S182, the CPU 62 multiplies the average value ha calculated in S181 by 1/2 to calculate a threshold Th. The CPU 62 can thereby set a threshold Th that depends on the four lengths h11 to h14. The threshold Th is used in the determination of S184, described later. In a modification, the threshold Th may be equal to the average value ha, or may be 1/3 or 2/3 of the average value ha. That is, the threshold Th need only be a value equal to or less than the average value ha.

In S183, the CPU 62 determines one of the four band-shaped areas LA11 to LA14 as the processing target (hereinafter the “target band-shaped area”).

In S184, the CPU 62 determines whether the vertical length of the target band-shaped area is equal to or less than the threshold Th calculated in S182. The CPU 62 can thereby determine whether the target band-shaped area is a character string band-shaped area including a character string or a modifier band-shaped area including a modification line. The vertical length of a character string band-shaped area (for example, LA11, LA12, LA14) is usually greater than the threshold Th. Accordingly, if the CPU 62 determines that the vertical length of the target band-shaped area is greater than the threshold Th (NO in S184), it determines that the target band-shaped area is a character string band-shaped area. The vertical length of a modifier band-shaped area (for example, LA13) is usually equal to or less than the threshold Th. Accordingly, if the CPU 62 determines that the vertical length of the target band-shaped area is equal to or less than the threshold Th (YES in S184), it determines that the target band-shaped area is a modifier band-shaped area.

If the CPU 62 determines that the target band-shaped area is a character string band-shaped area (NO in S184), it skips S195 and proceeds to S198. If the CPU 62 determines that the target band-shaped area is a modifier band-shaped area (YES in S184), it combines the target band-shaped area with the upper-line band-shaped area in S195. The upper-line band-shaped area is the band-shaped area adjacent to the target band-shaped area on its upper side. Specifically, the CPU 62 determines the area circumscribing both the target band-shaped area and the upper-line band-shaped area as a new band-shaped area. As the reference position of the new band-shaped area, the CPU 62 uses the reference position of the upper-line band-shaped area rather than that of the target band-shaped area. For example, the CPU 62 combines the band-shaped area LA13, which is the target band-shaped area, with the band-shaped area LA12, which is the upper-line band-shaped area, to determine a new band-shaped area LA12'. In doing so, the CPU 62 uses the reference position of the upper-line band-shaped area LA12 as the reference position of the new band-shaped area LA12' and does not use the reference position of the target band-shaped area LA13 (that is, it erases the target band-shaped area LA13 and its reference position from the memory 64).

As described above, in this embodiment, when the CPU 62 determines that the target band-shaped area is the modifier band-shaped area LA13, it combines the modifier band-shaped area LA13 with the character string band-shaped area LA12 to determine the new band-shaped area LA12'. Both the character string “F to J” and the modification line are thereby included in the single band-shaped area LA12'. In the subsequent processing, the CPU 62 therefore does not handle the character string “F to J” and the modification line separately but handles them together as one line of a modified character string. Hereinafter, the band-shaped area LA12' may be called a “modified character string band-shaped area”. For the modified character string band-shaped area LA12', the reference position of the character string band-shaped area LA12 is used rather than that of the modifier band-shaped area LA13. Because two or more lines of character strings are combined based on that reference position (see the combined image CI in S200 of FIG. 2), the arrangement of the characters constituting the character string in the processed image PI (see S500 of FIG. 2) is kept from appearing unnatural to the user.

In S198, the CPU 62 determines whether the processes of S183 to S195 have been completed for all of the band-shaped areas LA11 to LA14 included in the target text area TOA. If the CPU 62 determines that the processing has not been completed (NO in S198), it determines an unprocessed band-shaped area (for example, LA12) as the processing target in S183 and executes S184 and the subsequent processes again. If the CPU 62 determines that the processing has been completed (YES in S198), it ends the process of FIG. 5. In the example of FIG. 5, three band-shaped areas LA11, LA12', and LA14 are determined based on the four band-shaped areas LA11 to LA14.
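The flow of S181 to S198 can be sketched as follows. The dictionary layout with `top`, `bottom`, and `ref` keys, the function name `merge_modifier_bands`, and the pixel coordinates are hypothetical. The text does not say what happens when the topmost band is itself a modifier band; this sketch simply leaves such a band unchanged, which is an assumption.

```python
def merge_modifier_bands(bands, ratio=0.5):
    """Merge each thin (modifier) band into the band directly above it.

    bands: list of dicts with 'top', 'bottom' (inclusive rows) and 'ref'
           (reference row), ordered from top to bottom.
    ratio: factor applied to the average height (1/2 in S182).
    """
    heights = [b["bottom"] - b["top"] + 1 for b in bands]
    threshold = sum(heights) / len(heights) * ratio   # Th = ha * 1/2
    merged = []
    for band, height in zip(bands, heights):
        if height <= threshold and merged:
            # Modifier band: extend the upper band so that it circumscribes
            # both, and keep the upper band's reference position.
            merged[-1]["bottom"] = band["bottom"]
        else:
            merged.append(dict(band))                 # character string band
    return merged

# Hypothetical coordinates for LA11, LA12, LA13 (modification line), LA14.
bands = [
    {"top": 0, "bottom": 19, "ref": 17},
    {"top": 30, "bottom": 49, "ref": 47},
    {"top": 53, "bottom": 54, "ref": 54},
    {"top": 60, "bottom": 79, "ref": 77},
]
result = merge_modifier_bands(bands)
print(result)
```

Here the average height is 15.5 rows, so Th is 7.75; only the two-row band falls at or below it and is absorbed into the band above, leaving three bands as in the example of FIG. 5.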

(Combining process; FIG. 6)
Next, the combining process executed in S200 of FIG. 2 is described with reference to FIG. 6. In S210, the CPU 62 determines one of the one or more text areas in the scanned image SI as the processing target (hereinafter the “target text area”). The description below takes the case where the text area TOA is determined as the target text area as an example.

In S220, the CPU 62 generates combined image data representing a combined image CI in which the three lines of the character string “A to M” included in the target text area TOA are combined. Specifically, the CPU 62 first acquires, from the scan image data SID, three sets of partial image data representing the three band-shaped areas LA11, LA12', and LA14 (see FIG. 5) determined for the text area TOA. The CPU 62 then generates the combined image data using the partial image data, as shown in (1) to (3) of FIG. 6. The contents of (1) to (3) are described in detail below.

As shown in (1), the CPU 62 combines the rear end of the first partial image data, which represents the uppermost band-shaped area LA11, with the front end of the second partial image data, which represents the second band-shaped area LA12' from the top, to generate intermediate image data representing an intermediate image MI1. The intermediate image MI1 includes the character string “A to J”, in which the character string “A to E” in the band-shaped area LA11 and the character string “F to J” in the band-shaped area LA12' are combined. In doing so, the CPU 62 generates the intermediate image data by adding pixels representing a margin, more specifically pixels having the background color of the scanned image SI, so that a margin of a predetermined horizontal length (that is, the margin between “E” and “F”) is formed between the two combined lines of character strings. That is, the CPU 62 combines the first and second partial image data via the pixels representing the margin. Furthermore, the CPU 62 combines the first and second partial image data so that the two reference positions determined for the two band-shaped areas LA11 and LA12' (see S174 of FIG. 4) lie at the same position in the vertical direction. Because the band-shaped area LA12' includes not only the character string “F to J” but also the modification line, its vertical length is greater than that of the band-shaped area LA11. Accordingly, in the intermediate image MI1, a step is formed between the portion corresponding to the band-shaped area LA11 (that is, “A to E”) and the portion corresponding to the band-shaped area LA12' (that is, “F to J”).

Next, as shown in (2), the CPU 62 combines the rear end of the intermediate image data of (1) with the front end of the third partial image data, which represents the lowermost band-shaped area LA14, to generate intermediate image data representing an intermediate image MI2. As in (1), the CPU 62 combines the intermediate image data of (1) and the third partial image data via pixels representing the predetermined margin. The CPU 62 also combines the intermediate image data of (1) and the third partial image data so that the three reference positions determined for the three band-shaped areas LA11, LA12', and LA14 lie at the same position in the vertical direction. In the intermediate image MI2, a step is formed between the portion corresponding to the band-shaped area LA12' (that is, “F to J”) and the portion corresponding to the band-shaped area LA14 (that is, “K to M”).

Finally, as shown in (3), the CPU 62 adds pixels representing margin areas, more specifically pixels having the background color of the scanned image SI, to the intermediate image data representing the intermediate image MI2 so that a combined image CI having a rectangular shape circumscribing the intermediate image MI2 is formed. This eliminates the step between the portion corresponding to the band-shaped area LA11 (that is, “A to E”) and the portion corresponding to the band-shaped area LA12' (that is, “F to J”), as well as the step between the portion corresponding to the band-shaped area LA12' (that is, “F to J”) and the portion corresponding to the band-shaped area LA14 (that is, “K to M”), and completes the combined image data representing the rectangular combined image CI.
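Steps (1) to (3) can be illustrated with the following sketch. It is not the stepwise procedure of FIG. 6: for brevity it computes the circumscribing rectangle first and fills everything in one pass, which should give the same combined image. The function name `combine_bands`, the (pixels, ref) tuples, and the tiny sample bands are assumptions.

```python
def combine_bands(bands, margin=2, background=0):
    """Join bands side by side with their reference rows aligned.

    bands: list of (pixels, ref); pixels is a list of equal-length rows and
           ref is the reference row index inside that band.
    margin: number of background columns inserted between two bands.
    background: pixel value used for margins and for filling the steps.
    """
    above = max(ref for _, ref in bands)                  # rows needed above the reference row
    below = max(len(px) - 1 - ref for px, ref in bands)   # rows needed below it
    rows = [[] for _ in range(above + below + 1)]
    for i, (px, ref) in enumerate(bands):
        if i > 0:
            for row in rows:                              # margin between two lines
                row.extend([background] * margin)
        offset = above - ref                              # vertical shift that aligns ref
        width = len(px[0])
        for y, row in enumerate(rows):
            src = y - offset
            if 0 <= src < len(px):
                row.extend(px[src])                       # copy the band's pixels
            else:
                row.extend([background] * width)          # fill the step
    return rows

# Hypothetical bands: a two-row band and a three-row band whose extra
# bottom row stands in for the modification line (value 3).
first = ([[1, 1], [1, 1]], 1)
second = ([[2], [2], [3]], 1)
combined = combine_bands([first, second], margin=1)
print(combined)
```

Both reference rows land on row 1 of the result, and the cell under the first band is filled with the background value, which corresponds to removing the step in (3).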

Note that if the band-shaped areas LA12 and LA13 were not combined (that is, if the process of FIG. 5 were not executed), the following processing could be executed in S220. First, the two lines of character strings would be combined so that the character string “A to E” in the band-shaped area LA11 and the character string “F to J” in the band-shaped area LA12 line up in a straight line along the horizontal direction; then the character string and the modification line would be combined so that the character string “A to J” and the modification line in the band-shaped area LA13 line up in a straight line along the horizontal direction. In this case, the character string “A to J” is obtained. Then the two lines of character strings would be combined so that the character string “A to J” and the character string “K to M” line up in a straight line along the horizontal direction. As a result, the combined character string finally obtained is “A to J K to M”. That is, a combined character string in which the character string “FGH” is not modified by the modification line is obtained. In this embodiment, by contrast, the band-shaped areas LA12 and LA13 are combined to determine the single modified character string band-shaped area LA12' (S195 of FIG. 5). The CPU 62 can therefore treat the character string “F to J” in the band-shaped area LA12 and the modification line in the band-shaped area LA13 as one line of a modified character string when executing the combination of S220. As a result, combined image data is generated that represents the combined image CI including the combined character string “A to M”, in which the character string “FGH” is modified by the modification line.

Although the example of FIG. 5 assumes that the modification line is attached to an English character string, the same applies when the modification line is attached to a character string in another language such as Japanese or Chinese: the character string and the modification line are handled as one line of a modified character string, and combined image data representing the character string modified by the modification line is generated.

In S230, the CPU 62 uses the combined image data generated in S220 to execute the division candidate position determination process (see FIG. 7, described later) and determines candidate positions for dividing the combined image data.

In S250, the CPU 62 determines whether the processes of S210 to S230 have been completed for all text areas in the scanned image SI. If the CPU 62 determines that the processing has not been completed (NO in S250), it determines an unprocessed text area as the processing target in S210. If the CPU 62 determines that the processing has been completed (YES in S250), it ends the process of FIG. 6.

(Division candidate position determination process; FIG. 7)
Next, the contents of the division candidate position determination process executed in S230 of FIG. 6 will be described with reference to FIG. 7. The description below takes as an example the case where combined image data representing a combined image CI that includes the English sentence “I said I have a dream.” is used. Within this sentence, no modification line is attached to “I said”, and a modification line is attached to “I have a dream.”.

In S234, the CPU 62 executes binarization on the combined image data. The binarization is the same as in S110 of FIG. 3.

In S236, the CPU 62 uses the binary data generated in S234 to generate a projection histogram corresponding to the combined image data. The projection histogram shows the distribution of the frequency of ON pixels (that is, pixels indicating “1”) obtained when the pixels constituting the binary data are projected in the vertical direction. In the projection histogram, each character and the modification line appear as ranges where the frequency is higher than zero, and the blank portion between two characters (for example, in “I said”, the blank portion between “I” and “s”, the blank portion between “s” and “a”, and so on) appears as a range where the frequency is zero.
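The projection described for S236 can be sketched as a per-column count of ON pixels. The following is only an illustrative sketch: the function name `projection_histogram` and the list-of-rows image representation are assumptions made here, not part of the disclosure.

```python
def projection_histogram(binary):
    """Count ON pixels (value 1) in each column of a binary image.

    `binary` is a list of rows; each row is a list of 0/1 values.
    (This representation is an assumption made for the sketch.)
    """
    if not binary:
        return []
    width = len(binary[0])
    # Projecting in the vertical direction = summing each column.
    return [sum(row[x] for row in binary) for x in range(width)]


# Two small blobs separated by one blank column (column index 2).
image = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 0, 0],
]
hist = projection_histogram(image)
```

Columns whose count is zero correspond to the blank portions between characters; columns with a nonzero count correspond to characters or to a modification line.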

In S238, the CPU 62 sets a threshold for distinguishing, within the combined image CI, regions in which character-constituting pixels exist from regions in which they do not. Specifically, the CPU 62 sets zero as the threshold in principle. However, if one or more continuous ranges exist in the projection histogram generated in S236, the CPU 62 selects those continuous ranges and, for each selected continuous range, determines the minimum frequency within that range (that is, a value greater than zero) as the threshold. In other words, the CPU 62 determines a value greater than zero as the threshold for each continuous range and zero as the threshold for every other range. A continuous range is, for example, the range representing a modification line when the sentence includes one. Because the modification line is represented by ON pixels, the range of the projection histogram corresponding to the modification line has a frequency greater than zero and is relatively long in the horizontal direction. For this reason, in this embodiment, the CPU 62 selects, as a continuous range, a range whose frequency is higher than zero and whose horizontal length is equal to or greater than a predetermined length. The predetermined length is determined in advance according to the resolution of the scanned image data SID. For example, the predetermined length is 50 pixels when the resolution of the scanned image data SID is 300 dpi, and 100 pixels when the resolution is 600 dpi. The predetermined length may be any value that allows the presence of a modification line to be identified; for example, it is a value larger than the horizontal length of one character. The threshold determined here is used in S240 and S244, described later.
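The threshold selection of S238 can be illustrated with a short sketch that scans the projection histogram for runs of nonzero frequency. The function name `column_thresholds`, the per-column output, and the parameter `min_run_length` (standing in for the predetermined length, for example 50 pixels at 300 dpi) are assumptions introduced for illustration.

```python
def column_thresholds(hist, min_run_length):
    """Return one threshold per histogram column.

    Columns inside a "continuous range" (a run of nonzero frequencies that is
    at least `min_run_length` columns long) get the minimum frequency of that
    run; every other column gets zero.
    """
    thresholds = [0] * len(hist)
    x = 0
    while x < len(hist):
        if hist[x] == 0:
            x += 1
            continue
        start = x
        while x < len(hist) and hist[x] > 0:
            x += 1
        # hist[start:x] is a maximal run of nonzero frequencies.
        if x - start >= min_run_length:
            floor = min(hist[start:x])
            for i in range(start, x):
                thresholds[i] = floor
    return thresholds


# A short run (one column) followed by a long run (five columns).
example = column_thresholds([3, 0, 2, 1, 4, 1, 3, 0], min_run_length=4)
```

In the long run, the minimum frequency (1 in this toy data) plays the role of the modification line's own contribution, so columns that contain only the line fall at or below the threshold and can still be treated as blank portions.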

In S240, the CPU 62 uses the projection histogram generated in S236 and the threshold determined in S238 to determine one intermediate blank region as the processing target. An intermediate blank region is a region corresponding to the blank portion between two characters. Specifically, it is a region whose frequency is equal to or lower than the threshold determined in S238 and which is sandwiched between two regions whose frequency is higher than that threshold. For example, in the combined image CI of FIG. 7, a frequency of zero is determined as the threshold for the region corresponding to the character string “I said”, to which no modification line is attached. In this case, for example, the region BA1 (a region of frequency zero) sandwiched between two regions whose frequency is higher than zero (the “I” region and the “s” region) is an intermediate blank region. In the first execution of S240, the CPU 62 determines the intermediate blank region nearest the leading end (that is, the leftmost one; region BA1 in the combined image CI of FIG. 7) as the processing target. In the second and subsequent executions of S240, the CPU 62 determines, as the current processing target, the intermediate blank region nearest the leading end among the one or more intermediate blank regions to the right of the previous processing target (for example, the region between “s” and “a” in “said”).

In S242, the CPU 62 determines whether the horizontal length of the intermediate blank region being processed is less than h/4. Here, “h” is the vertical length of the combined image CI (that is, its number of pixels in the vertical direction).

If the CPU 62 determines that the horizontal length of the intermediate blank region being processed is h/4 or more (NO in S242), in other words, that the intermediate blank region is relatively large, it determines the right end of that intermediate blank region as a division candidate position in S246. Because a blank region, rather than a position inside a character, is determined as the division candidate position, a single character (for example, “A”) can be kept from being divided. The right end of the intermediate blank region is chosen for the following reason. Suppose, for example, that one line of character string is divided at the right ends of two intermediate blank regions in it, and that the resulting three lines of character strings, arranged along the vertical direction, are rearranged. In this case, no blank is formed at the left of the second and third lines, so the leading ends of the second and third lines can be aligned. Because the leading ends of the second and subsequent lines can be aligned in this way, the rearranged multi-line character strings look neat. In a modification, the CPU 62 may determine a position other than the right end of the intermediate blank region (for example, the left end or the middle) as the division candidate position in S246. When S246 ends, the process proceeds to S248.

On the other hand, if the CPU 62 determines that the horizontal length of the intermediate blank region being processed is less than h/4 (YES in S242), in other words, that the intermediate blank region is relatively small, it determines in S244 whether the horizontal length of at least one of the left adjacent region and the right adjacent region is less than h/2. The left (or right) adjacent region is the region that adjoins the intermediate blank region being processed on its left (or right) side and has a frequency higher than the threshold determined in S238. For example, for the intermediate blank region between “s” and “a” in “said”, the region corresponding to “s” is the left adjacent region and the region corresponding to “a” is the right adjacent region.

If the CPU 62 determines that the horizontal lengths of both the left adjacent region and the right adjacent region are h/2 or more (NO in S244), for example, when relatively large characters (such as uppercase letters of the alphabet, kanji, or Japanese kana) are present in both adjacent regions, it determines the right end of the intermediate blank region as a division candidate position in S246. On the other hand, if the CPU 62 determines that the horizontal length of at least one of the left adjacent region and the right adjacent region is less than h/2 (YES in S244), for example, when a relatively small character (such as a lowercase letter of the alphabet) or a symbol (such as a comma, period, or quotation mark) is present in at least one of the adjacent regions, it proceeds to S248 without executing S246. That is, the CPU 62 does not determine the intermediate blank region currently being processed as a division candidate position.
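The two-stage test of S242 and S244 can be condensed into one predicate. This is a minimal sketch under the assumption that the widths of the blank region and of its two adjacent regions have already been measured in pixels; the function and parameter names are hypothetical.

```python
def is_division_candidate(gap_width, left_width, right_width, h):
    """Decide whether an intermediate blank region yields a division candidate.

    gap_width   -- horizontal length of the intermediate blank region
    left_width  -- horizontal length of the left adjacent region
    right_width -- horizontal length of the right adjacent region
    h           -- vertical length of the combined image
    """
    if gap_width >= h / 4:
        # NO in S242: the blank is relatively large, so S246 is executed.
        return True
    if left_width < h / 2 or right_width < h / 2:
        # YES in S244: a small character or symbol adjoins the blank.
        return False
    # NO in S244: both neighbours are relatively large characters.
    return True


word_space = is_division_candidate(gap_width=12, left_width=6, right_width=14, h=40)
inside_word = is_division_candidate(gap_width=3, left_width=14, right_width=15, h=40)
between_kana = is_division_candidate(gap_width=3, left_width=30, right_width=32, h=40)
```

With these illustrative numbers, the word space and the gap between two wide characters are accepted, while the narrow gap between two lowercase letters is rejected.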

In S248, the CPU 62 determines whether the processes of S240 to S246 have been completed for all intermediate blank regions included in the combined image CI. If the CPU 62 determines that they have not been completed (NO in S248), it determines an unprocessed intermediate blank region as the processing target in S240 and executes the processes from S242 onward again. If the CPU 62 determines that they have been completed (YES in S248), it ends the process of FIG. 7.

(Specific example of the division position determination process; FIG. 8)
Next, a specific example of the division position determination process of FIG. 7 will be described with reference to FIG. 8. The combined image of case A is the same as the combined image CI of FIG. 7. The region BA1 between “I” and “s” in “I said” is determined as the first intermediate blank region to be processed (S240). The intermediate blank region BA1 corresponds to the blank (a so-called space) between the word “I” and the word “said” and usually has a horizontal length of h/4 or more (NO in S242). The intermediate blank region BA1 is therefore determined as a division candidate position (S246). Even if the character string is divided and a line break is inserted at the blank between the two English words “I” and “said”, the user is unlikely to find the divided character strings hard to read, which is why the intermediate blank region BA1 is determined as a division candidate position in this embodiment.

Next, the region BA2 between “s” and “a” in “I said” is determined as the second intermediate blank region to be processed (S240). The intermediate blank region BA2 corresponds to the blank between two characters (“s” and “a”) of the single English word “said” and usually has a horizontal length of less than h/4 (YES in S242). The left and right adjacent regions of the intermediate blank region BA2 correspond to the “s” and the “a” of “said”, respectively, and usually have horizontal lengths of less than h/2 (YES in S244). The intermediate blank region BA2 is therefore not determined as a division candidate position. If a character string were divided and a line break inserted at the blank between two characters (for example, “s” and “a”) of a single English word (for example, “said”), the user would be likely to find the divided character strings hard to read, which is why the intermediate blank region BA2 is not determined as a division candidate position in this embodiment.

In the same way, whether each of the third and subsequent intermediate blank regions is a division candidate position is determined. For the continuous range corresponding to the character string “I have a dream.”, to which the modification line is attached, the threshold is greater than zero, and a region whose frequency is equal to or lower than that threshold (for example, the region between “I” and “h”) is determined as an intermediate blank region (S240). Also for that continuous range, a region whose frequency is higher than the threshold (for example, the region corresponding to “I” or the region corresponding to “h”) is determined as an adjacent region (S244). Because a threshold greater than zero is determined for the continuous range corresponding to the modification line in this embodiment, the intermediate blank regions and the adjacent regions can be determined appropriately with the modification line taken into account. In case A, five division candidate positions are consequently determined for the sentence “I said I have a dream.”.

Case A assumes an example in which the blank between two English words is determined as a division candidate position. However, even when, for example, a space one character wide is inserted between two Japanese sentences, that space usually results in NO in S242 and is determined as a division candidate position (S246). For languages other than English and Japanese as well, a relatively large blank, where one exists, is usually determined as a division candidate position.

The combined image of case B includes a Japanese character string. The intermediate blank region BA5 corresponds to the blank between the parenthesis C1 and the hiragana C2 (that is, “あ”) and usually has a horizontal length of less than h/4 (YES in S242). The right adjacent region (the hiragana C2) usually has a horizontal length of h/2 or more, but the left adjacent region (the parenthesis C1) usually has a horizontal length of less than h/2 (YES in S244). The intermediate blank region BA5 is therefore not determined as a division candidate position. If the character string were divided and a line break inserted at the blank between a parenthesis and a character, the user would be likely to find the divided character strings hard to read, which is why the intermediate blank region BA5 is not determined as a division candidate position in this embodiment.

The intermediate blank region BA6 corresponds to the blank between the left stroke and the right stroke of the single hiragana C3 (that is, “い”) and usually has a horizontal length of less than h/4 (YES in S242). The left adjacent region (the left stroke of the hiragana C3) and the right adjacent region (the right stroke of the hiragana C3) usually have horizontal lengths of less than h/2 (YES in S244). The intermediate blank region BA6 is therefore not determined as a division candidate position. If the single hiragana C3 were divided by a line break, the user could not recognize it as one character, which is why the intermediate blank region BA6 is not determined as a division candidate position in this embodiment. A blank can form inside a single character not only for the hiragana “い” but also for the hiragana C8 to C10 (that is, “け”, “に”, and “は”), the katakana C11 (that is, “ハ”), and the kanji C12 (that is, “卵”); such a blank is likewise usually not determined as a division candidate position (YES in S244).

The intermediate blank region BA7 corresponds to the blank between the hiragana C4 (that is, “う”) and the hiragana C5 (that is, “え”) and usually has a horizontal length of less than h/4 (YES in S242). The left adjacent region (the hiragana C4) and the right adjacent region (the hiragana C5) usually have horizontal lengths of h/2 or more (NO in S244). The intermediate blank region BA7 is therefore determined as a division candidate position (S246). Even if the character string is divided and a line break is inserted at the blank between two Japanese characters, the user is unlikely to find the divided character strings hard to read, which is why the intermediate blank region BA7 is determined as a division candidate position in this embodiment.

The intermediate blank region BA8 corresponds to the blank between the hiragana C6 (that is, “お”) and the full stop C7 (that is, “。”) and usually has a horizontal length of less than h/4 (YES in S242). The left adjacent region (the hiragana C6) usually has a horizontal length of h/2 or more, but the right adjacent region (the full stop C7) usually has a horizontal length of less than h/2 (YES in S244). The intermediate blank region BA8 is therefore not determined as a division candidate position. If the character string were divided and a line break inserted at the blank between a character and a full stop, the user would be likely to find the divided character strings hard to read, which is why the intermediate blank region BA8 is not determined as a division candidate position in this embodiment. Similarly, the blank between a character and a comma (that is, “、”) is usually not determined as a division candidate position (YES in S244).

(Rearrangement process; FIG. 9)
Next, the contents of the rearrangement process executed in S400 of FIG. 2 will be described with reference to FIG. 9. In S410, the CPU 62 determines one text region (for example, TOA) among the one or more text regions in the scanned image SI as the processing target. Hereinafter, the text region determined as the processing target in S410 is called the “target text region”. The target region determined for the target text region (for example, TA in S300 of FIG. 2) is called the “target target region”. The combined image in which the character strings included in the target text region are combined (for example, CI in S200 of FIG. 2) and the combined image data representing that combined image are called the “target combined image” and the “target combined image data”, respectively.

In S420, the CPU 62 sets the horizontal length OPx and the vertical length OPy of the target text region as the initial values of the horizontal length W (that is, the number of pixels W in the horizontal direction) and the vertical length H (that is, the number of pixels H in the vertical direction), respectively, of the candidate rearrangement region, which is a candidate for the rearrangement region to be determined (see RA in S400 of FIG. 2).

In S430, the CPU 62 determines whether the ratio W/H of the horizontal length W to the vertical length H of the candidate rearrangement region is less than the ratio TW/TH of the horizontal length TW to the vertical length TH of the target target region.

If the CPU 62 determines that the ratio W/H is less than the ratio TW/TH (YES in S430), it adds a predetermined fixed value β (for example, one pixel) to the current horizontal length W of the candidate rearrangement region in S432 to determine a new horizontal length W of the candidate rearrangement region. When S432 ends, the process proceeds to S440.

On the other hand, if the CPU 62 determines that the ratio W/H is equal to or greater than the ratio TW/TH (NO in S430), it subtracts a predetermined fixed value β (for example, one pixel) from the current horizontal length W of the candidate rearrangement region in S434 to determine a new horizontal length W of the candidate rearrangement region. When S434 ends, the process proceeds to S440. In this embodiment, the same fixed value β is used in S432 and S434; in a modification, the fixed value of S432 and the fixed value of S434 may differ.
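The width update of S430 to S434 amounts to a single comparison of aspect ratios followed by a one-step change to W. A minimal sketch, with the function name `adjust_width` and the default `beta=1` chosen here for illustration:

```python
def adjust_width(w, h, tw, th, beta=1):
    """Return the new horizontal length W of the candidate rearrangement region.

    w, h   -- current horizontal and vertical lengths of the candidate region
    tw, th -- horizontal and vertical lengths of the target region
    beta   -- fixed step in pixels (1 in the example given in the text)
    """
    if w / h < tw / th:
        return w + beta  # YES in S430 -> S432: the candidate is too narrow
    return w - beta      # NO in S430 -> S434: the candidate is too wide


widened = adjust_width(100, 50, 300, 100)    # 2.0 < 3.0, so W grows
narrowed = adjust_width(400, 100, 300, 100)  # 4.0 >= 3.0, so W shrinks
```

Note that when the two ratios are exactly equal, the condition of S430 is not met, so W is decreased.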

In S440, the CPU 62 determines the line spacing m along the vertical direction (that is, the number of pixels m between lines) according to the resolution of the scanned image data SID. For example, the CPU 62 determines one pixel as the line spacing m when the resolution of the scanned image data SID is 300 dpi, and two pixels when the resolution is 600 dpi. That is, the higher the resolution of the scanned image data SID, the larger the line spacing m that the CPU 62 determines. With this configuration, the CPU 62 can determine a line spacing m of a size appropriate to the resolution of the scanned image data SID. In a modification, the same value may be adopted as the line spacing m regardless of the resolution of the scanned image data SID.

In S450, the CPU 62 executes a number-of-lines determination process (see FIG. 10, described later) based on the target combined image data and the new horizontal length W of the candidate rearrangement region determined in S432 or S434. In the number-of-lines determination process, the CPU 62 determines the number of lines needed to rearrange the plurality of characters (for example, “A to M”) included in the target combined image (for example, CI in FIG. 2) within the candidate rearrangement region.

(Number-of-lines determination process; FIG. 10)
As shown in FIG. 10, in S451 the CPU 62 determines whether the horizontal length IW of the target combined image (for example, CI in FIG. 10) is equal to or less than the horizontal length W of the candidate rearrangement region. If the CPU 62 determines that the length IW is equal to or less than the length W (YES in S451), it determines “1” as the number of lines in S452. This is because all the characters “A to M” included in the target combined image CI fit within the candidate rearrangement region while lined up in a straight line along the horizontal direction. When S452 ends, the process of FIG. 10 ends.

On the other hand, if the CPU 62 determines that the length IW is greater than the length W (NO in S451), the plurality of characters “A to M” included in the target combined image CI must be divided and arranged over multiple lines. To this end, the CPU 62 executes S453 and S454 to select one or more division candidate positions from among the plurality of division candidate positions determined in S230 of FIG. 6 (see, for example, the arrows attached to the target combined image CI in FIG. 10).

In S453, the CPU 62 selects one division candidate position from among the plurality of division candidate positions so that the selection length SW is the maximum length that does not exceed the horizontal length W of the candidate rearrangement region. When no division candidate position has been selected yet, the selection length SW is the horizontal length between the leading end (that is, the left end) of the target combined image CI and the division candidate position to be selected. When one or more division candidate positions have already been selected, the selection length SW is the horizontal length between the most recently selected division candidate position and the division candidate position to be newly selected, which lies nearer the trailing end (that is, to the right) of the most recently selected one. In the example of FIG. 10, the division candidate position between the character “F” and the character “G” is selected.

In S454, the CPU 62 determines whether the remaining length RW is equal to or less than the horizontal length W of the candidate rearrangement region. The remaining length RW is the horizontal length between the most recently selected division candidate position and the trailing end of the target combined image. If the CPU 62 determines that the remaining length RW is greater than the length W (NO in S454), it returns to S453 and newly selects, from among the plurality of division candidate positions, a division candidate position that lies nearer the trailing end than the most recently selected one.

On the other hand, if the CPU 62 determines that the remaining length RW is equal to or less than the length W (YES in S454), it determines, in S455, the number obtained by adding “1” to the number of selected division candidate positions as the number of lines. When S455 ends, the process of FIG. 10 ends.
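The flow of S451 to S455 is a greedy selection of division candidate positions. The sketch below assumes that candidate positions are given as sorted x-coordinates measured from the left end of the target combined image; the function name `count_lines` is hypothetical, and raising an error when no candidate fits within W is an assumption made here, since the text does not describe that situation.

```python
def count_lines(image_width, candidates, w):
    """Return the number of lines needed to fit the combined image in width w.

    image_width -- horizontal length IW of the target combined image
    candidates  -- sorted x-coordinates of the division candidate positions
    w           -- horizontal length W of the candidate rearrangement region
    """
    if image_width <= w:      # YES in S451
        return 1              # S452
    selected = []
    start = 0                 # left end, or the most recently selected position
    while True:
        # S453: pick the candidate giving the largest selection length SW <= W.
        reachable = [c for c in candidates if start < c and c - start <= w]
        if not reachable:
            # Not covered by the text; treated as an error in this sketch.
            raise ValueError("no division candidate fits within the width")
        start = max(reachable)
        selected.append(start)
        if image_width - start <= w:   # remaining length RW <= W: YES in S454
            return len(selected) + 1   # S455


positions = [20, 45, 60, 85, 110]
three_lines = count_lines(130, positions, 60)
one_line = count_lines(130, positions, 200)
```

With a width of 60, the positions 60 and 110 are selected in turn, leaving a remainder of 20, so three lines result.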

（再配置処理の続き；図９） (Continuation of rearrangement processing; FIG. 9)
図９のＳ４６０では、ＣＰＵ６２は、Ｓ４６０内の数式に従って、候補再配置領域の縦方向の新たな長さＨを決定する。Ｓ４６０内の数式において、「ｍ」はＳ４４０で決定された行間の長さであり、「ｎ」はＳ４５０で決定された行数であり、「ｈ」は対象結合画像データの縦方向の長さである（図１０内の結合画像ＣＩのｈ参照）。 In S460 of FIG. 9, the CPU 62 determines a new vertical length H of the candidate rearrangement region according to the formula in S460. In that formula, “m” is the line spacing determined in S440, “n” is the number of lines determined in S450, and “h” is the vertical length of the target combined image data (see h of the combined image CI in FIG. 10).
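The formula itself appears only in the drawing and is not reproduced in this text. A plausible reading, assuming that n lines of height h are stacked with a gap of m between adjacent lines (which is how S480 later arranges the divided images), is H = n × h + (n − 1) × m. The Python sketch below encodes that assumed reading; the function names are hypothetical.

```python
def layout_height(m, n, h):
    """Assumed S460 formula: n lines of height h separated by (n - 1) gaps of m."""
    if n < 1:
        raise ValueError("the number of lines n must be at least 1")
    return n * h + (n - 1) * m


def line_offsets(m, n, h):
    """Top offset of each line under the same assumption (first line at 0)."""
    return [k * (h + m) for k in range(n)]


# Three lines of height 20 with a line spacing of 4.
print(layout_height(4, 3, 20))
print(line_offsets(4, 3, 20))
```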

Ｓ４７０では、ＣＰＵ６２は、候補再配置領域のアスペクト比Ｗ／Ｈが対象目標領域のアスペクト比ＴＷ／ＴＨに近似するのか否かを判断する。具体的には、ＣＰＵ６２は、候補再配置領域のアスペクト比Ｗ／Ｈが、対象目標領域のアスペクト比ＴＷ／ＴＨに基づいて設定される所定範囲内に含まれるのか否かを判断する。上記の所定範囲は、対象目標領域のアスペクト比ＴＷ／ＴＨから値γを減算することによって得られる値と、対象目標領域のアスペクト比ＴＷ／ＴＨに値γを加算することによって得られる値と、の間の範囲である。なお、値γは、予め決められている固定値であってもよいし、ＴＷ／ＴＨに所定の係数（例えば０．０５）を乗算することによって得られる値であってもよい。 In S470, the CPU 62 determines whether or not the aspect ratio W / H of the candidate rearrangement area approximates the aspect ratio TW / TH of the target target area. Specifically, the CPU 62 determines whether or not the aspect ratio W / H of the candidate rearrangement area falls within a predetermined range that is set based on the aspect ratio TW / TH of the target target area. The predetermined range is the range between the value obtained by subtracting a value γ from the aspect ratio TW / TH of the target target area and the value obtained by adding the value γ to the aspect ratio TW / TH of the target target area. Note that the value γ may be a predetermined fixed value, or may be a value obtained by multiplying TW / TH by a predetermined coefficient (for example, 0.05).
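The S470 test can be written as a short Python check. This is a sketch under stated assumptions: the function name `aspect_is_close` is hypothetical, and the range is treated as inclusive at both ends, which the text does not specify.

```python
def aspect_is_close(w, h, tw, th, gamma=None, coefficient=0.05):
    """Return True if W/H lies within TW/TH - gamma .. TW/TH + gamma.

    If gamma is not given, it is derived as (TW/TH) * coefficient, the second
    option mentioned in the text. Inclusive bounds are an assumption.
    """
    target = tw / th
    if gamma is None:
        gamma = target * coefficient
    return target - gamma <= w / h <= target + gamma


print(aspect_is_close(195, 100, 200, 100))  # 1.95 against 2.0 with gamma 0.1
print(aspect_is_close(180, 100, 200, 100))  # 1.80 against 2.0 with gamma 0.1
```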

ＣＰＵ６２は、候補再配置領域のアスペクト比Ｗ／Ｈが対象目標領域のアスペクト比ＴＷ／ＴＨに近似しないと判断する場合（Ｓ４７０でＮＯ）には、Ｓ４３０〜Ｓ４６０の各処理を再び実行する。これにより、ＣＰＵ６２は、候補再配置領域の横方向の新たな長さＷと縦方向の新たな長さＨとを決定して、Ｓ４７０の判断を再び実行する。 If the CPU 62 determines that the aspect ratio W / H of the candidate rearrangement area does not approximate the aspect ratio TW / TH of the target target area (NO in S470), the CPU 62 executes each process of S430 to S460 again. Thereby, the CPU 62 determines a new length W in the horizontal direction and a new length H in the vertical direction of the candidate rearrangement region, and executes the determination in S470 again.

一方、ＣＰＵ６２は、候補再配置領域のアスペクト比Ｗ／Ｈが対象目標領域のアスペクト比ＴＷ／ＴＨに近似すると判断する場合（Ｓ４７０でＹＥＳ）には、Ｓ４８０において、まず、横方向の長さＷと縦方向の長さＨとを有する候補再配置領域を再配置領域（例えば図２のＲＡ）として決定する。そして、ＣＰＵ６２は、図１０のＳ４５３で１個以上の分断候補位置を選択済みである場合には、当該１個以上の分断候補位置で対象結合画像データを分断して、２個以上の分断画像を表わす２個以上の分断画像データを生成する。次いで、ＣＰＵ６２は、２個以上の分断画像が縦方向に沿って並ぶように、２個以上の分断画像データを再配置領域内に配置する。この際に、ＣＰＵ６２は、縦方向に沿って隣接する２個の分断画像の間にＳ４４０で決定された行間が形成されるように、２個の分断画像データを配置する。この結果、例えば、図２のＳ４００に示されるように、複数個の文字「Ａ」〜「Ｍ」が再配置領域ＲＡ内に再配置されている再配置画像ＲＩを表わす再配置画像データが生成される。再配置画像ＲＩ内の複数個の文字「Ａ」〜「Ｍ」のサイズは、スキャン画像ＳＩ内の複数個の文字「Ａ」〜「Ｍ」のサイズに等しい。 On the other hand, if the CPU 62 determines that the aspect ratio W / H of the candidate rearrangement area approximates the aspect ratio TW / TH of the target target area (YES in S470), then in S480 the CPU 62 first determines the candidate rearrangement area having the horizontal length W and the vertical length H as the rearrangement area (for example, RA in FIG. 2). Then, if one or more division candidate positions have been selected in S453 of FIG. 10, the CPU 62 divides the target combined image data at those division candidate positions to generate two or more pieces of divided image data representing two or more divided images. Next, the CPU 62 arranges the two or more pieces of divided image data in the rearrangement area so that the two or more divided images are lined up along the vertical direction. At this time, the CPU 62 arranges the pieces of divided image data so that the line spacing determined in S440 is formed between any two divided images adjacent in the vertical direction. As a result, for example, as shown in S400 of FIG. 2, rearranged image data is generated that represents a rearranged image RI in which the plurality of characters “A” to “M” are rearranged within the rearrangement area RA. The sizes of the plurality of characters “A” to “M” in the rearranged image RI are equal to the sizes of the plurality of characters “A” to “M” in the scanned image SI.

Ｓ４９０では、ＣＰＵ６２は、全てのテキスト領域について、Ｓ４１０〜Ｓ４８０の処理が終了したのか否かを判断する。ＣＰＵ６２は、処理が終了していないと判断する場合（Ｓ４９０でＮＯ）には、Ｓ４１０において、未処理のテキスト領域を処理対象として決定して、Ｓ４２０以降の各処理を再び実行する。そして、ＣＰＵ６２は、処理が終了したと判断する場合（Ｓ４９０でＹＥＳ）には、図９の処理を終了する。 In S490, the CPU 62 determines whether or not the processing of S410 to S480 has been completed for all text areas. If the CPU 62 determines that the process has not ended (NO in S490), the CPU 62 determines an unprocessed text area as a process target in S410, and executes each process after S420 again. Then, if the CPU 62 determines that the process has been completed (YES in S490), the process of FIG. 9 is terminated.

（具体的なケース；図１１） (Specific case; FIG. 11)
続いて、図１１を参照して、図２のＳ４００の再配置処理（図９参照）とＳ５００の拡大処理について、具体的なケースを説明する。（１）に示されるように、候補再配置領域の横方向の長さＷの初期値、縦方向の長さＨの初期値として、それぞれ、対象テキスト領域ＴＯＡの横方向の長さＯＰｘ、縦方向の長さＯＰｙが設定される（図９のＳ４２０）。本ケースでは、Ｗ／ＨがＴＷ／ＴＨ未満である。即ち、対象目標領域ＴＡは、対象テキスト領域ＴＯＡと比べると、横長の形状を有する。この場合、候補再配置領域を横長の形状にしていけば、候補再配置領域のアスペクト比が対象目標領域ＴＡのアスペクト比に近づくことになる。従って、（２）に示されるように、候補再配置領域の横方向の現在の長さＷに固定値βが加算されて、候補再配置領域の横方向の新たな長さＷが決定される（Ｓ４３２）。この場合、行数として、文字列「Ａ〜Ｅ」と文字列「Ｆ〜Ｊ」と文字列「Ｋ〜Ｍ」とを含む３行が決定される（Ｓ４５０）。そして、候補再配置領域の縦方向の新たな長さＨが決定される（Ｓ４６０）。 Next, with reference to FIG. 11, a specific case of the rearrangement process of S400 in FIG. 2 (see FIG. 9) and the enlargement process of S500 will be described. As shown in (1), the horizontal length OPx and the vertical length OPy of the target text area TOA are set as the initial values of the horizontal length W and the vertical length H of the candidate rearrangement area, respectively (S420 in FIG. 9). In this case, W / H is less than TW / TH. That is, the target target area TA has a horizontally longer shape than the target text area TOA. In this situation, making the candidate rearrangement area horizontally longer brings its aspect ratio closer to that of the target target area TA. Accordingly, as shown in (2), the fixed value β is added to the current horizontal length W of the candidate rearrangement area to determine a new horizontal length W (S432). In this case, three lines consisting of the character string “A to E”, the character string “F to J”, and the character string “K to M” are determined as the number of lines (S450). Then, a new vertical length H of the candidate rearrangement area is determined (S460).

（２）の状態では、候補再配置領域のアスペクト比Ｗ／Ｈが対象目標領域ＴＡのアスペクト比ＴＷ／ＴＨに近似しないので（Ｓ４７０でＮＯ）、（３）に示されるように、候補再配置領域の横方向の現在の長さＷに固定値βが再び加算されて、候補再配置領域の横方向の新たな長さＷが再び決定される（Ｓ４３２）。この場合、行数として、文字列「Ａ〜Ｆ」と文字列「Ｇ〜Ｌ」と文字列「Ｍ」とを含む３行が決定される（Ｓ４５０）。即ち、候補再配置領域の横方向の長さＷが大きくなったことに起因して、候補再配置領域内の１行の文字列を構成することが可能な最大の文字数が増える。そして、候補再配置領域の縦方向の新たな長さＨが決定される（Ｓ４６０）。 In the state of (2), since the aspect ratio W / H of the candidate rearrangement area does not approximate the aspect ratio TW / TH of the target target area TA (NO in S470), as shown in (3), the candidate rearrangement The fixed value β is added again to the current length W in the horizontal direction of the area, and the new horizontal length W of the candidate rearrangement area is determined again (S432). In this case, three lines including the character string “A to F”, the character string “G to L”, and the character string “M” are determined as the number of lines (S450). That is, the maximum number of characters that can form one line of character string in the candidate rearrangement area increases due to the increase in the horizontal length W of the candidate rearrangement area. Then, a new vertical length H of the candidate rearrangement region is determined (S460).

（３）の状態では、候補再配置領域のアスペクト比Ｗ／Ｈが対象目標領域ＴＡのアスペクト比ＴＷ／ＴＨに近似する（Ｓ４７０でＹＥＳ）。従って、（４）に示されるように、（３）の候補再配置領域が再配置領域ＲＡとして決定される（Ｓ４８０）。次いで、対象結合画像ＣＩを表わす対象結合画像データが分断されて、３個の分断画像ＤＩ１〜ＤＩ３を表わす３個の分断画像データが生成される（Ｓ４８０）。そして、３個の分断画像ＤＩ１〜ＤＩ３が縦方向に沿って並び、かつ、隣接する２個の分断画像の間に長さｍを有する行間が形成されるように、３個の分断画像データが再配置領域ＲＡ内に配置される。この結果、再配置画像ＲＩを表わす再配置画像データが生成される（Ｓ４８０）。 In the state (3), the aspect ratio W / H of the candidate rearrangement area approximates the aspect ratio TW / TH of the target target area TA (YES in S470). Therefore, as shown in (4), the candidate rearrangement region in (3) is determined as the rearrangement region RA (S480). Next, the target combined image data representing the target combined image CI is divided, and three divided image data representing the three divided images DI1 to DI3 are generated (S480). Then, the three divided image data are arranged so that three divided images DI1 to DI3 are arranged in the vertical direction and a line space having a length m is formed between two adjacent divided images. Arranged in the rearrangement area RA. As a result, rearranged image data representing the rearranged image RI is generated (S480).

次いで、再配置画像データが拡大されて、拡大画像を表わす拡大画像データが生成される（図２のＳ５００）。具体的には、再配置画像ＲＩの対角線が伸びる方向に再配置画像ＲＩが拡大され、その結果、拡大画像を表わす拡大画像データが生成される。例えば、再配置領域ＲＡのアスペクト比Ｗ／Ｈが対象目標領域ＴＡのアスペクト比ＴＷ／ＴＨに等しい場合には、拡大画像の４個の辺の全てが、対象目標領域ＴＡの４個の辺に一致する。即ち、この場合、拡大画像のサイズが目標領域ＴＡのサイズに一致する。ただし、例えば、再配置領域ＲＡのアスペクト比Ｗ／Ｈが対象目標領域ＴＡのアスペクト比ＴＷ／ＴＨに等しくない場合には、再配置画像ＲＩを徐々に拡大していく過程において、拡大画像のいずれかの辺が対象目標領域ＴＡのいずれかの辺に一致した段階で、再配置画像ＲＩの拡大が終了する。即ち、この場合、拡大画像のサイズが目標領域ＴＡのサイズよりも小さくなる。 Next, the rearranged image data is enlarged to generate enlarged image data representing an enlarged image (S500 in FIG. 2). Specifically, the rearranged image RI is enlarged in the direction in which its diagonal extends, and as a result, enlarged image data representing the enlarged image is generated. For example, when the aspect ratio W / H of the rearrangement area RA is equal to the aspect ratio TW / TH of the target target area TA, all four sides of the enlarged image coincide with the four sides of the target target area TA. That is, in this case, the size of the enlarged image matches the size of the target area TA. However, when, for example, the aspect ratio W / H of the rearrangement area RA is not equal to the aspect ratio TW / TH of the target target area TA, the enlargement ends, in the course of gradually enlarging the rearranged image RI, at the stage where any side of the enlarged image coincides with any side of the target target area TA. That is, in this case, the size of the enlarged image is smaller than the size of the target area TA.
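Enlarging along the diagonal until a side first meets the target area is equivalent to uniform scaling by the smaller of the two side ratios. The Python sketch below illustrates this; the function name `enlarge_to_fit` is hypothetical, and the closed-form factor replaces the gradual enlargement described in the text.

```python
def enlarge_to_fit(w, h, tw, th):
    """Return (t, (enlarged_w, enlarged_h)) for uniform enlargement.

    t is the largest magnification at which the W x H rearranged image still
    fits inside the TW x TH target area, i.e. min(TW / W, TH / H).
    """
    t = min(tw / w, th / h)
    return t, (w * t, h * t)


# Equal aspect ratios: the enlarged image fills the target area exactly.
print(enlarge_to_fit(100, 50, 200, 100))
# Unequal aspect ratios: the height matches first and the width stays smaller.
print(enlarge_to_fit(100, 50, 300, 100))
```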

続いて、（５）に示されるように、再配置画像ＲＩを表わす再配置画像データが拡大された拡大画像データが、スキャン画像データＳＩＤの目標領域ＴＡ内に上書きされる（図２のＳ５００）。この結果、処理済み画像ＰＩを表わす処理済み画像データＰＩＤが完成する。 Subsequently, as shown in (5), the enlarged image data obtained by enlarging the rearranged image data representing the rearranged image RI is overwritten in the target area TA of the scanned image data SID (S500 in FIG. 2). . As a result, processed image data PID representing the processed image PI is completed.

図１１に示されるように、（１）のスキャン画像ＳＩでは、修飾線は、文字列「ＦＧＨ」の近傍に配置されている。そして、文字列「Ｆ〜Ｊ」と修飾線が１行の修飾文字列として扱われるので、結合画像ＣＩでも、修飾線は、文字列「ＦＧＨ」の近傍に配置されている。即ち、スキャン画像ＳＩ内での文字列「ＦＧＨ」と修飾線との修飾関係が維持されている結合画像ＣＩが得られる。また、結合画像ＣＩから（４）の再配置画像ＲＩが生成され、再配置画像ＲＩから（５）の処理済み画像ＰＩが生成されるので、再配置画像ＲＩ及び処理済み画像ＰＩでも、修飾線は、文字列「ＦＧＨ」の近傍に配置されている。即ち、スキャン画像ＳＩ内での文字列「ＦＧＨ」と修飾線との修飾関係が維持されている再配置画像ＲＩ及び処理済み画像ＰＩが得られる。 As shown in FIG. 11, in the scan image SI of (1), the modification line is arranged in the vicinity of the character string “FGH”. Since the character string “F to J” and the modification line are handled as one line of modified character string, the modification line is arranged in the vicinity of the character string “FGH” in the combined image CI as well. That is, a combined image CI is obtained in which the modification relationship between the character string “FGH” and the modification line in the scan image SI is maintained. In addition, since the rearranged image RI of (4) is generated from the combined image CI and the processed image PI of (5) is generated from the rearranged image RI, the modification line is arranged in the vicinity of the character string “FGH” in the rearranged image RI and the processed image PI as well. In other words, a rearranged image RI and a processed image PI are obtained in which the modification relationship between the character string “FGH” and the modification line in the scan image SI is maintained.

また、（１）のスキャン画像ＳＩでは、符号ｈ１、符号ｈ２は、それぞれ、修飾線が付されている文字列「ＦＧＨ」の縦方向の長さ、文字列「ＦＧＨ」と修飾線との間の縦方向の長さ（以下では単に「間隔の長さ」と呼ぶ）である。文字列「Ｆ〜Ｊ」と修飾線が１行の修飾文字列として扱われるので、結合画像ＣＩでも、文字列「ＦＧＨ」の縦方向の長さ、間隔の長さは、それぞれ、ｈ１、ｈ２である。同様に、（４）の再配置画像ＲＩでも、文字列「ＦＧＨ」の縦方向の長さ、間隔の長さは、それぞれ、ｈ１、ｈ２である。また、（５）の拡大画像ＰＩでは、拡大された文字列「ＦＧＨ」の縦方向の長さ、拡大された間隔の長さは、それぞれ、ｈ１×ｔ、ｈ２×ｔである（ｔは拡大倍率である）。従って、いずれの画像ＳＩ，ＣＩ，ＲＩ，ＰＩにおいても、文字列「ＦＧＨ」の縦方向の長さに対する間隔の長さの比は、ｈ２／ｈ１である。即ち、各画像ＳＩ，ＣＩ，ＲＩ，ＰＩにおいて、文字列「ＦＧＨ」の縦方向の長さに対する間隔の長さの比は等しい。 Further, in the scanned image SI of (1), the symbols h1 and h2 denote, respectively, the vertical length of the character string “FGH” to which the modification line is attached and the vertical length between the character string “FGH” and the modification line (hereinafter simply called the “interval length”). Since the character string “F to J” and the modification line are handled as one line of modified character string, the vertical length of the character string “FGH” and the interval length are h1 and h2, respectively, in the combined image CI as well. Similarly, in the rearranged image RI of (4), the vertical length of the character string “FGH” and the interval length are h1 and h2, respectively. In the enlarged image PI of (5), the vertical length of the enlarged character string “FGH” and the enlarged interval length are h1 × t and h2 × t, respectively (t is the enlargement magnification). Accordingly, in every one of the images SI, CI, RI, and PI, the ratio of the interval length to the vertical length of the character string “FGH” is h2 / h1. That is, this ratio is equal across the images SI, CI, RI, and PI.

（第１実施例の効果） (Effects of the first embodiment)
本実施例によると、図５に示されるように、画像処理サーバ５０は、４個の帯状領域ＬＡ１１〜ＬＡ１４の縦方向に沿った４個の長さｈ１１〜ｈ１４に基づいて、４個の帯状領域ＬＡ１１〜ＬＡ１４の中から修飾物帯状領域ＬＡ１３を特定する（Ｓ１８４でＹＥＳ）。そして、画像処理サーバ５０は、修飾物帯状領域ＬＡ１３を文字列帯状領域ＬＡ１２に結合して、修飾文字列帯状領域ＬＡ１２’を決定する（Ｓ１９５）。これにより、画像処理サーバ５０は、文字列帯状領域ＬＡ１２内の文字列「Ｆ〜Ｊ」と、修飾物帯状領域ＬＡ１３内の修飾線と、を１行の修飾文字列として扱うことができる。この結果、図６に示されるように、画像処理サーバ５０は、文字列「ＦＧＨ」が修飾線によって修飾されている結合文字列「Ａ〜Ｍ」を含む結合画像ＣＩを表わす結合画像データを生成することができる。特に、画像処理サーバ５０は、２個以上の文字を含み得る帯状領域を単位として処理を実行して、結合画像データを生成する。従って、画像処理サーバ５０は、スキャン画像ＳＩ内の１個の文字を単位として処理を実行せずに済み、この結果、結合画像データを迅速に生成することができる。このように、画像処理サーバ５０は、スキャン画像ＳＩ内での文字列「ＦＧＨ」と修飾線との修飾関係が維持されている結合画像ＣＩを表わす結合画像データを迅速に生成することができる。そして、図１１に示されるように、画像処理サーバ５０は、結合画像データを利用して再配置画像データ及び処理済み画像データＰＩＤを生成するので、スキャン画像ＳＩ内での文字列「ＦＧＨ」と修飾線との修飾関係が維持されている再配置画像ＲＩ及び処理済み画像ＰＩを表わす再配置画像データ及び処理済み画像データＰＩＤを適切に生成することができる。 According to the present embodiment, as shown in FIG. 5, the image processing server 50 identifies the modifier band-like region LA13 from among the four band-like regions LA11 to LA14 based on the four lengths h11 to h14 of those regions along the vertical direction (YES in S184). Then, the image processing server 50 combines the modifier band-like region LA13 with the character string band-like region LA12 to determine the modified character string band-like region LA12′ (S195). Thereby, the image processing server 50 can handle the character string “F to J” in the character string band-like region LA12 and the modification line in the modifier band-like region LA13 as one line of modified character string. As a result, as shown in FIG. 6, the image processing server 50 can generate combined image data representing the combined image CI including the combined character string “A to M” in which the character string “FGH” is modified by the modification line. In particular, the image processing server 50 executes processing in units of band-like regions, each of which can include two or more characters, to generate the combined image data. Therefore, the image processing server 50 does not need to execute processing in units of individual characters in the scanned image SI, and as a result can generate the combined image data quickly. In this way, the image processing server 50 can quickly generate combined image data representing the combined image CI in which the modification relationship between the character string “FGH” and the modification line in the scan image SI is maintained. Then, as shown in FIG. 11, since the image processing server 50 generates the rearranged image data and the processed image data PID using the combined image data, it can appropriately generate rearranged image data and processed image data PID representing the rearranged image RI and the processed image PI in which the modification relationship between the character string “FGH” and the modification line in the scanned image SI is maintained.

（対応関係） (Correspondence)
画像処理サーバ５０が、「画像処理装置」の一例である。スキャン画像ＳＩ、結合画像ＣＩが、それぞれ、「原画像」、「対象画像」の一例である。図１１では、スキャン画像ＳＩにおいて、３行の文字列「Ａ〜Ｍ」、文字列「ＦＧＨ」、修飾線が、それぞれ、「Ｍ行の文字列」、「被修飾文字」、「修飾物」の一例である。図５では、４個の帯状領域ＬＡ１１〜ＬＡ１４、３個の帯状領域ＬＡ１１，ＬＡ１２，ＬＡ１４、１個の帯状領域ＬＡ１３が、それぞれ、「複数個の帯状領域」、「Ｍ個の主帯状領域」、「副帯状領域」の一例である。また、帯状領域ＬＡ１２が、「近傍主帯状領域」の一例である。図５のＳ１８２の閾値Ｔｈが、「設定値」の一例である。図４では、Ｓ１６２で生成される射影ヒストグラム、Ｓ１７０で算出される合計下辺長さが、それぞれ、「第２の射影ヒストグラム」、「評価値」の一例である。また、横方向、左側、右側が、それぞれ、「第１方向」、「第１方向の第１側」、「第１方向の第２側」の一例である。縦方向、上側、下側が、それぞれ、「第２方向」、「第２方向の第１側」、「第２方向の第２側」の一例である。 The image processing server 50 is an example of the “image processing device”. The scanned image SI and the combined image CI are examples of the “original image” and the “target image”, respectively. In FIG. 11, in the scanned image SI, the three lines of character strings “A to M”, the character string “FGH”, and the modification line are examples of the “M lines of character strings”, the “modified character”, and the “modifier”, respectively. In FIG. 5, the four band-like regions LA11 to LA14, the three band-like regions LA11, LA12, LA14, and the one band-like region LA13 are examples of the “plurality of band-like regions”, the “M main band-like regions”, and the “sub band-like region”, respectively. Further, the band-like region LA12 is an example of the “neighboring main band-like region”. The threshold value Th in S182 of FIG. 5 is an example of the “set value”. In FIG. 4, the projection histogram generated in S162 and the total lower side length calculated in S170 are examples of the “second projection histogram” and the “evaluation value”, respectively. The horizontal direction, the left side, and the right side are examples of the “first direction”, the “first side in the first direction”, and the “second side in the first direction”, respectively. The vertical direction, the upper side, and the lower side are examples of the “second direction”, the “first side in the second direction”, and the “second side in the second direction”, respectively.

（第２実施例；図１２） (Second embodiment; FIG. 12)
本実施例では、図５のＳ１８１及びＳ１８２の処理に代えて、図１２のＳ１８１及びＳ１８２の処理が実行される。即ち、本実施例では、閾値Ｔｈを設定するための手法が第１実施例とは異なる。 In the present embodiment, the processes of S181 and S182 of FIG. 12 are executed instead of the processes of S181 and S182 of FIG. 5. That is, in the present embodiment, the method for setting the threshold Th differs from that of the first embodiment.

Ｓ１８１では、ＣＰＵ６２は、長さ（即ち画素数）を示す横軸と出現頻度を示す縦軸とによって画定される平面上に、４個の帯状領域ＬＡ１１〜ＬＡ１４の４個の長さｈ１１〜ｈ１４の出現頻度を示す各点を上記の平面上にプロットして、当該各点を直線で結んだグラフ（即ち図１２内のグラフ）を生成する。当該グラフでは、３個の範囲Ｒ１〜Ｒ３が得られる。範囲Ｒ１は、１個の修飾物帯状領域ＬＡ１３の１個の長さｈ１３の出現頻度（即ち「１」）を示す点がピークを構成する範囲である。範囲Ｒ２は、出現頻度ゼロを示す範囲である。範囲Ｒ３は、３個の文字列帯状領域ＬＡ１１，ＬＡ１２，ＬＡ１４の３個の長さｈ１１，ｈ１２，ｈ１４の出現頻度（即ち「３」）を示す点がピークを構成する範囲である。より具体的には、範囲Ｒ３は、最高の出現頻度（即ち「３」）を含む範囲である。 In S181, the CPU 62 determines the four lengths h11 to h14 of the four strip regions LA11 to LA14 on a plane defined by the horizontal axis indicating the length (that is, the number of pixels) and the vertical axis indicating the appearance frequency. Each point indicating the appearance frequency is plotted on the plane, and a graph in which the points are connected by a straight line (that is, the graph in FIG. 12) is generated. In the graph, three ranges R1 to R3 are obtained. The range R1 is a range in which a point indicating the frequency of appearance of one length h13 (that is, “1”) of one modification band-like region LA13 forms a peak. The range R2 is a range indicating zero appearance frequency. The range R3 is a range in which points indicating the appearance frequencies (that is, “3”) of the three lengths h11, h12, and h14 of the three character string belt-like regions LA11, LA12, and LA14 constitute a peak. More specifically, the range R3 is a range including the highest appearance frequency (that is, “3”).

Ｓ１８２では、ＣＰＵ６２は、Ｓ１８１で生成されたグラフを利用して、最高の出現頻度（即ち「３」）を含み、かつ、出現頻度がゼロより高い範囲Ｒ３を特定し、次いで、横軸が示す長さが範囲Ｒ３よりも小さく、かつ、出現頻度がゼロである範囲Ｒ２を特定する。そして、ＣＰＵ６２は、横軸上の範囲Ｒ２と範囲Ｒ３との境界の位置に対応する長さを閾値Ｔｈとして設定する。 In S 182, the CPU 62 uses the graph generated in S 181 to identify the range R 3 that includes the highest appearance frequency (that is, “3”) and has an appearance frequency higher than zero, and then the horizontal axis indicates A range R2 whose length is smaller than the range R3 and whose appearance frequency is zero is specified. Then, the CPU 62 sets the length corresponding to the position of the boundary between the range R2 and the range R3 on the horizontal axis as the threshold Th.

本実施例では、修飾物帯状領域の数（図１２の例では１個）が、文字列帯状領域の数（図１２の例では３個）よりも少ないという状況を想定している。そして、出現頻度がゼロである範囲Ｒ２と最高の出現頻度を含む範囲Ｒ３との境界の位置に対応する長さを閾値Ｔｈとして設定すれば、修飾物帯状領域と文字列帯状領域とを区別することができる。即ち、修飾物帯状領域ＬＡ１３の長さｈ１３は、通常、閾値Ｔｈ以下であり（範囲Ｒ１参照）、各文字列帯状領域ＬＡ１１，ＬＡ１２，ＬＡ１４の各長さｈ１１，ｈ１２，ｈ１４は、通常、閾値Ｔｈよりも大きい（範囲Ｒ３参照）。従って、本実施例でも、Ｓ１８４において、画像処理サーバ５０は、対象帯状領域が、修飾物帯状領域であるのか、文字列帯状領域であるのか、を適切に判断することができる。本実施例では、範囲Ｒ３、範囲Ｒ２が、それぞれ、「第１の範囲」、「第２の範囲」の一例である。 The present embodiment assumes a situation in which the number of modifier band-like regions (one in the example of FIG. 12) is smaller than the number of character string band-like regions (three in the example of FIG. 12). If the length corresponding to the position of the boundary between the range R2, in which the appearance frequency is zero, and the range R3, which includes the highest appearance frequency, is set as the threshold Th, the modifier band-like regions can be distinguished from the character string band-like regions. That is, the length h13 of the modifier band-like region LA13 is usually less than or equal to the threshold Th (see the range R1), and the lengths h11, h12, h14 of the character string band-like regions LA11, LA12, LA14 are usually greater than the threshold Th (see the range R3). Therefore, in the present embodiment as well, the image processing server 50 can appropriately determine in S184 whether the target band-like region is a modifier band-like region or a character string band-like region. In the present embodiment, the range R3 and the range R2 are examples of the “first range” and the “second range”, respectively.
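The threshold selection of S181 and S182 in this embodiment can be sketched in Python as follows. The sketch makes several assumptions that the text leaves open: lengths are integer pixel counts, a "range" is a run of consecutive lengths with nonzero frequency, ties for the highest frequency go to the longer length, and the boundary between R2 and R3 is taken as the largest zero-frequency length just below R3, so that every length in R3 is strictly greater than Th. The function names are hypothetical.

```python
from collections import Counter


def threshold_below_mode(lengths):
    """Return Th: the length just below the range containing the modal length."""
    freq = Counter(lengths)  # S181: appearance frequency of each length
    # Length with the highest frequency; ties go to the longer length (assumption).
    peak = max(freq, key=lambda length: (freq[length], length))
    # Walk down to the lower edge of the nonzero range R3 that contains the peak.
    low = peak
    while freq[low - 1] > 0:
        low -= 1
    # The zero-frequency range R2 ends immediately below R3.
    return low - 1


def is_modifier_band(length, th):
    """S184: a band whose length is at most Th is treated as a modifier band."""
    return length <= th


heights = [20, 20, 4, 20]  # h11, h12, h13, h14 (illustrative values)
th = threshold_below_mode(heights)
print(th, [is_modifier_band(h, th) for h in heights])
```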

（第３実施例；図１３） (Third embodiment; FIG. 13)
本実施例では、図５のＳ１８１及びＳ１８２の処理に代えて、図１３のＳ１８１及びＳ１８２の処理が実行される。本実施例では、修飾物帯状領域の数が、文字列帯状領域の数よりも多い状況でも、対象帯状領域が、修飾物帯状領域であるのか、文字列帯状領域であるのか、を適切に判断することができる。 In the present embodiment, the processes of S181 and S182 of FIG. 13 are executed instead of the processes of S181 and S182 of FIG. 5. In the present embodiment, even when the number of modifier band-like regions is larger than the number of character string band-like regions, it is possible to appropriately determine whether the target band-like region is a modifier band-like region or a character string band-like region.

Ｓ１８１は、図１２の第２実施例のＳ１８１と同様である。ただし、修飾物帯状領域の数が多いので、各修飾物帯状領域の各長さの出現頻度を示す点がピークを構成する範囲Ｒ４が、最高の出現頻度を含む。範囲Ｒ５は、出現頻度ゼロを示す範囲である。範囲Ｒ６は、各文字列帯状領域の各長さの出現頻度を示す点がピークを構成する範囲である。 S181 is the same as S181 in the second embodiment of FIG. However, since the number of the modified band-like regions is large, a range R4 in which a point indicating the appearance frequency of each length of each modified band-like region forms a peak includes the highest appearance frequency. The range R5 is a range that shows zero appearance frequency. The range R6 is a range in which a point indicating the appearance frequency of each length of each character string belt-shaped region constitutes a peak.

Ｓ１８２では、ＣＰＵ６２は、Ｓ１８１で生成されたグラフを利用して、出現頻度がゼロより高い複数個の範囲Ｒ４，Ｒ６のうち、横軸が示す長さが最大である範囲Ｒ６を特定し、次いで、横軸が示す長さが範囲Ｒ６よりも小さく、かつ、出現頻度がゼロである範囲Ｒ５を特定する。そして、ＣＰＵ６２は、横軸上の範囲Ｒ５と範囲Ｒ６との境界の位置に対応する長さを閾値Ｔｈとして設定する。 In S182, the CPU 62 uses the graph generated in S181 to identify the range R6 having the maximum length indicated by the horizontal axis among the plurality of ranges R4 and R6 whose appearance frequency is higher than zero, and then The range R5 in which the length indicated by the horizontal axis is smaller than the range R6 and the appearance frequency is zero is specified. Then, the CPU 62 sets the length corresponding to the position of the boundary between the range R5 and the range R6 on the horizontal axis as the threshold Th.

本実施例によると、修飾物帯状領域の数が、文字列帯状領域の数よりも多くても、閾値Ｔｈを適切に設定することができる。本実施例では、範囲Ｒ６、範囲Ｒ５が、それぞれ、「第１の範囲」、「第２の範囲」の一例である。 According to the present embodiment, the threshold value Th can be appropriately set even if the number of the modifier band-like regions is larger than the number of character string belt-like regions. In the present embodiment, the range R6 and the range R5 are examples of the “first range” and the “second range”, respectively.
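A Python sketch of this variant follows. Instead of starting from the most frequent length, it starts from the longest observed length, so the result does not depend on which kind of band is more numerous. As before, integer pixel lengths, runs of consecutive nonzero lengths as "ranges", and placing Th at the top of the zero-frequency range are assumptions, and the function name is hypothetical.

```python
from collections import Counter


def threshold_below_longest_range(lengths):
    """Return Th: the length just below the nonzero range with the largest lengths."""
    freq = Counter(lengths)
    # Start at the longest observed length, which lies in range R6.
    low = max(freq)
    # Walk down to the lower edge of R6.
    while freq[low - 1] > 0:
        low -= 1
    # The zero-frequency range R5 ends immediately below R6.
    return low - 1


# Five short modifier bands and two tall character string bands (illustrative).
heights = [4, 4, 5, 4, 20, 21, 4]
print(threshold_below_longest_range(heights))
```

With these sample values the most frequent length is 4, so a rule based on the highest frequency would place the threshold below the modifier bands, while this rule places it just below the character string bands.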

（第４実施例；図１４） (Fourth embodiment; FIG. 14)
本実施例では、図５の修飾物解析処理に代えて、図１４の修飾物解析処理が実行される。図１４に示されるように、本実施例では、スキャン画像ＳＩは、日本語のセンテンスを含む。当該センテンスは、漢字を含む文字列Ｃ２２（即ち「私の名前は」）を含んでおり、当該文字列の上側に平仮名のルビＣ２１（即ち「なまえ」）が付されている。即ち、ルビは、漢字を修飾するための修飾物である。この場合、図４のＳ１６４では、３個の帯状領域ＬＡ２１〜ＬＡ２３が決定される。帯状領域ＬＡ２１は、ルビＣ２１を含み、修飾物帯状領域である。また、帯状領域ＬＡ２２，ＬＡ２３は、それぞれ、文字列Ｃ２２，Ｃ２３を含む文字列帯状領域である。 In the present embodiment, the modifier analysis process of FIG. 14 is executed instead of the modifier analysis process of FIG. 5. As shown in FIG. 14, in the present embodiment, the scan image SI includes a Japanese sentence. The sentence includes a character string C22 containing kanji (namely “私の名前は”, meaning “my name is”), and hiragana ruby C21 (namely “なまえ”, the reading of “名前”) is attached above the character string. That is, ruby is a modifier for modifying kanji. In this case, three band-like regions LA21 to LA23 are determined in S164 of FIG. 4. The band-like region LA21 includes the ruby C21 and is a modifier band-like region. The band-like regions LA22 and LA23 are character string band-like regions including the character strings C22 and C23, respectively.

Ｓ１８１〜Ｓ１８４は、図５のＳ１８１〜Ｓ１８４と同様である。従って、Ｓ１８１では、３個の帯状領域ＬＡ２１〜ＬＡ２３の縦方向に沿った３個の長さｈ２１〜ｈ２３の平均値ｈａが算出され、Ｓ１８２では、閾値Ｔｈが算出される。ルビを含む帯状領域ＬＡ２１の縦方向の長さｈ２１は、通常、閾値Ｔｈ以下である。ＣＰＵ６２は、対象帯状領域が修飾物帯状領域（例えばＬＡ２１）であると判断する場合（Ｓ１８４でＹＥＳ）には、Ｓ１９６において、対象帯状領域と下行の帯状領域とを結合する。下行の帯状領域は、対象帯状領域の下側において、対象帯状領域の隣に存在する帯状領域である。そして、本実施例では、ＣＰＵ６２は、修飾物帯状領域であると判断される帯状領域ＬＡ２１と、下行の帯状領域である帯状領域ＬＡ２２と、を結合して、１個の修飾文字列帯状領域ＬＡ２２’を決定する。この際に、ＣＰＵ６２は、修飾文字列帯状領域ＬＡ２２’の基準位置として下行の帯状領域ＬＡ２２の基準位置を利用し、帯状領域ＬＡ２１の基準位置を利用しない。Ｓ１９８は、図５のＳ１９８と同様である。 S181 to S184 are the same as S181 to S184 in FIG. 5. Accordingly, in S181, the average value ha of the three lengths h21 to h23 of the three band-like regions LA21 to LA23 along the vertical direction is calculated, and in S182, the threshold Th is calculated. The vertical length h21 of the band-like region LA21 containing the ruby is usually less than or equal to the threshold Th. If the CPU 62 determines that the target band-like region is a modifier band-like region (for example, LA21) (YES in S184), then in S196 the CPU 62 combines the target band-like region with the band-like region of the line below. The band-like region of the line below is the band-like region adjacent to the target band-like region on its lower side. In the present embodiment, the CPU 62 combines the band-like region LA21, which is determined to be a modifier band-like region, with the band-like region LA22, which is the band-like region of the line below, to determine one modified character string band-like region LA22′. At this time, the CPU 62 uses the reference position of the lower band-like region LA22 as the reference position of the modified character string band-like region LA22′, and does not use the reference position of the band-like region LA21. S198 is the same as S198 in FIG. 5.

仮に、帯状領域ＬＡ２１と帯状領域ＬＡ２２とが結合されなければ、図６のＳ２２０では、文字列Ｃ２２（即ち「名前」）がルビＣ２１によって修飾されていない結合文字列が得られる。これに対し、本実施例では、帯状領域ＬＡ２２’内に文字列Ｃ２２とルビＣ２１との双方が含まれることになり、以降の処理では、文字列Ｃ２２とルビＣ２１とが１行の修飾文字列として扱われる。この結果、図６のＳ２２０では、文字列Ｃ２２がルビＣ２１によって修飾されている結合文字列が得られる。従って、画像処理サーバ５０は、スキャン画像ＳＩ内の修飾関係が維持されている結合画像ＣＩを表わす結合画像データを適切に生成することができる。また、画像処理サーバ５０は、スキャン画像ＳＩ内の修飾関係が維持されている処理済み画像ＰＩを表わす処理済み画像データを適切に生成することができる。 If the band-like region LA21 and the band-like region LA22 were not combined, S220 of FIG. 6 would yield a combined character string in which the character string C22 (namely “名前”) is not modified by the ruby C21. In contrast, in the present embodiment, both the character string C22 and the ruby C21 are included in the band-like region LA22′, and in the subsequent processing the character string C22 and the ruby C21 are handled as one line of modified character string. As a result, S220 of FIG. 6 yields a combined character string in which the character string C22 is modified by the ruby C21. Therefore, the image processing server 50 can appropriately generate combined image data representing the combined image CI in which the modification relationship in the scan image SI is maintained. The image processing server 50 can also appropriately generate processed image data representing the processed image PI in which the modification relationship in the scanned image SI is maintained.

本実施例では、２行の文字列Ｃ２２，Ｃ２３、ルビＣ２１、ルビＣ２１が付されている漢字が、それぞれ、「Ｍ行の文字列」、「修飾物」、「被修飾文字」の一例である。３個の帯状領域ＬＡ２１〜ＬＡ２３、２個の帯状領域ＬＡ２２，ＬＡ２３、１個の帯状領域ＬＡ２１が、それぞれ、「複数個の帯状領域」、「Ｍ個の主帯状領域」、「副帯状領域」の一例である。また、帯状領域ＬＡ２２が、「近傍副帯状領域」の一例である。 In the present embodiment, the two lines of character strings C22 and C23, the ruby C21, and the kanji to which the ruby C21 is attached are examples of the “M lines of character strings”, the “modifier”, and the “modified character”, respectively. The three band-like regions LA21 to LA23, the two band-like regions LA22 and LA23, and the one band-like region LA21 are examples of the “plurality of band-like regions”, the “M main band-like regions”, and the “sub band-like region”, respectively. Further, the band-like region LA22 is an example of the “neighboring sub band-like region”.

（第５実施例；図１５） (Fifth embodiment; FIG. 15)
本実施例では、図５の修飾物解析処理に代えて、図１５の修飾物解析処理が実行される。図１５に示されるように、本実施例では、スキャン画像ＳＩは、日本語のセンテンスを含む。第４実施例（図１４参照）と同様に、当該センテンスは、漢字を含む文字列Ｃ３３とルビＣ３２とを含む。また、当該センテンスは、文字列Ｃ３３を修飾するための修飾線を含む。この場合、図４のＳ１６４では、５個の帯状領域ＬＡ３１〜ＬＡ３５が決定される。帯状領域ＬＡ３２は、ルビＣ３２を含み、修飾物帯状領域である。帯状領域ＬＡ３４は、修飾線を含み、修飾物帯状領域である。また、各帯状領域ＬＡ３１，ＬＡ３３，ＬＡ３５は、それぞれ、文字列Ｃ３１，Ｃ３３，Ｃ３５を含む文字列帯状領域である。 In the present embodiment, the modifier analysis process of FIG. 15 is executed instead of the modifier analysis process of FIG. 5. As shown in FIG. 15, in the present embodiment, the scanned image SI includes a Japanese sentence. As in the fourth embodiment (see FIG. 14), the sentence includes a character string C33 containing kanji and a ruby C32. The sentence also includes a modification line for modifying the character string C33. In this case, five band-like regions LA31 to LA35 are determined in S164 of FIG. 4. The band-like region LA32 includes the ruby C32 and is a modifier band-like region. The band-like region LA34 includes the modification line and is a modifier band-like region. The band-like regions LA31, LA33, LA35 are character string band-like regions including the character strings C31, C33, C35, respectively.

Ｓ１８１〜Ｓ１８４は、図５のＳ１８１〜Ｓ１８４と同様である。ＣＰＵ６２は、対象帯状領域が修飾物帯状領域（例えばＬＡ３２，ＬＡ３４）であると判断する場合（Ｓ１８４でＹＥＳ）には、Ｓ１８５において、距離ｄ１と距離ｄ２とを算出する。距離ｄ１は、対象帯状領域と上行の帯状領域との間の縦方向に沿った距離である。距離ｄ２は、対象帯状領域と下行の帯状領域との間の縦方向に沿った距離である。 S181 to S184 are the same as S181 to S184 in FIG. 5. If the CPU 62 determines that the target band-like region is a modifier band-like region (for example, LA32 or LA34) (YES in S184), the CPU 62 calculates a distance d1 and a distance d2 in S185. The distance d1 is the distance along the vertical direction between the target band-like region and the band-like region of the line above. The distance d2 is the distance along the vertical direction between the target band-like region and the band-like region of the line below.

Ｓ１８６では、ＣＰＵ６２は、距離ｄ１が距離ｄ２未満であるのか否かを判断する。ＣＰＵ６２は、距離ｄ１が距離ｄ２未満であると判断する場合（Ｓ１８６でＹＥＳ）には、Ｓ１９５において、対象帯状領域と上行の帯状領域とを結合する。一方、ＣＰＵ６２は、距離ｄ１が距離ｄ２以上であると判断する場合（Ｓ１８６でＮＯ）には、Ｓ１９６において、対象帯状領域と下行の帯状領域とを結合する。Ｓ１９５、Ｓ１９６は、それぞれ、図５のＳ１９５、図１４のＳ１９６と同様である。また、Ｓ１９８は、図５のＳ１９８と同様である。 In S186, the CPU 62 determines whether or not the distance d1 is less than the distance d2. When determining that the distance d1 is less than the distance d2 (YES in S186), the CPU 62 combines the target belt-like region and the upper belt-like region in S195. On the other hand, if the CPU 62 determines that the distance d1 is greater than or equal to the distance d2 (NO in S186), the target band-shaped area and the lower band-shaped area are combined in S196. S195 and S196 are the same as S195 of FIG. 5 and S196 of FIG. 14, respectively. S198 is the same as S198 in FIG.
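The comparison in S185 and S186 can be sketched in Python as follows. Bands are modeled as hypothetical `(top, bottom)` pixel-row pairs with y increasing downward, and the coordinates below are invented to resemble the arrangement of LA31 to LA35; how a band with no neighbor above or below is handled is not covered by the text and is left out of the sketch.

```python
def choose_neighbor(upper, target, lower):
    """Decide which neighbor a modifier band joins, per S185 and S186.

    Each band is a (top, bottom) pair of pixel rows, with y increasing downward.
    """
    d1 = target[0] - upper[1]   # S185: gap to the band of the line above
    d2 = lower[0] - target[1]   # S185: gap to the band of the line below
    # S186: d1 < d2 -> join the upper band (S195); otherwise the lower band (S196).
    return "upper" if d1 < d2 else "lower"


def merge(a, b):
    """Bounding band that covers both a and b."""
    return (min(a[0], b[0]), max(a[1], b[1]))


la31, la32, la33, la34, la35 = (0, 20), (30, 36), (38, 58), (60, 62), (75, 95)

print(choose_neighbor(la31, la32, la33))    # ruby band LA32
merged = merge(la32, la33)                  # modified character string band
print(choose_neighbor(merged, la34, la35))  # modification line band LA34
```

In this invented layout the ruby band sits closer to the line below it, and the modification line sits closer to the already merged band above it, so the two modifier bands end up attached to the same character string band.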

In this embodiment, when the band region LA32 is the target band region determined to be a modifier band region, the CPU 62 calculates the distance d1 between the band regions LA31 and LA32 and the distance d2 between the band regions LA32 and LA33 (S185). Because the CPU 62 determines that d1 is greater than or equal to d2 (NO in S186), it combines the band region LA32 with the lower band region LA33 to determine one modified-character-string band region (S196). When the band region LA34 is the target band region determined to be a modifier band region, the CPU 62 calculates the distance d1 between the band region LA34 and that modified-character-string band region, and the distance d2 between the band regions LA34 and LA35 (S185). Because the CPU 62 determines that d1 is less than d2 (YES in S186), it combines the band region LA34 with the upper band region, namely the modified-character-string band region described above, to determine a new modified-character-string band region LA33' (S195). In this way, based on the distances d1 and d2, the CPU 62 can appropriately determine whether a target band region judged to be a modifier band region should be combined with the upper band region or with the lower band region.
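The distance comparison of S185 and S186 can be sketched as follows. This is a minimal illustration rather than the patented implementation: the `Band` class, the function names, and the pixel coordinates are all assumptions made for the example, and the sketch assumes that every modifier band has a non-modifier band directly above and below it.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Band:
    name: str
    top: int           # y of the first pixel row of the band
    bottom: int        # y one past the last pixel row of the band
    is_modifier: bool  # True for a modifier band region (result of S184)


def merge(main: Band, other: Band) -> Band:
    """Union of two vertically adjacent bands; the result is treated as a text line."""
    return Band(main.name, min(main.top, other.top), max(main.bottom, other.bottom), False)


def merge_modifier_bands(bands: List[Band]) -> List[Band]:
    """Attach each modifier band to the nearer of its two neighbours (S185, S186)."""
    out = list(bands)
    i = 0
    while i < len(out):
        target = out[i]
        if not target.is_modifier or i == 0 or i == len(out) - 1:
            i += 1
            continue
        upper, lower = out[i - 1], out[i + 1]
        d1 = target.top - upper.bottom   # gap to the upper band
        d2 = lower.top - target.bottom   # gap to the lower band
        if d1 < d2:                      # YES in S186, so combine upward (S195)
            out[i - 1:i + 1] = [merge(upper, target)]
        else:                            # NO in S186, so combine downward (S196)
            out[i:i + 2] = [merge(lower, target)]
            i += 1
    return out


# Hypothetical coordinates for the five band regions of FIG. 15.
bands = [
    Band("LA31", 0, 20, False),
    Band("LA32", 30, 38, True),    # ruby, close to LA33
    Band("LA33", 40, 60, False),
    Band("LA34", 62, 64, True),    # modifier line, close to LA33
    Band("LA35", 80, 100, False),
]
merged = merge_modifier_bands(bands)
print([(b.name, b.top, b.bottom) for b in merged])
```

With these assumed coordinates, the ruby band LA32 is first combined downward with LA33, and the modifier-line band LA34 is then combined upward with the already merged band, which mirrors how LA33' is obtained in the text.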

The finally obtained modified-character-string band region LA33' contains all of the character string C33, the ruby C32, and the modifier line, and in the subsequent processing all of these are handled as a single line of modified character string. The image processing server 50 can therefore appropriately generate combined image data representing a combined image CI in which the modification relationships in the scanned image SI are maintained. As a result, the image processing server 50 can appropriately generate processed image data representing a processed image PI in which those modification relationships are maintained. In this embodiment, for example, when the band region LA32 is the target band region determined to be a modifier band region, the upper band region LA31 and the lower band region LA33 are examples of the "first main band region" and the "second main band region", respectively. The distances d1 and d2 are examples of the "first distance" and the "second distance", respectively.

(Sixth embodiment; FIG. 16)
In this embodiment, the modifier analysis process of FIG. 16 is executed instead of the modifier analysis process of FIG. 5. The scanned image SI includes the same Japanese sentence as in the fifth embodiment of FIG. 15.

S181 to S184 are the same as S181 to S184 in FIG. 5. When the CPU 62 determines that the target band region is a modifier band region (for example, LA32 or LA34) (YES in S184), it generates a projection histogram corresponding to the target band region in S187. Specifically, the CPU 62 first acquires, from the scanned image data SID, partial image data representing the target band region and executes binarization on that partial image data. The binarization is the same as in S110 of FIG. 3. The CPU 62 then generates the projection histogram from the binary data. The projection histogram shows the distribution of the frequency of ON pixels (that is, pixels with the value "1") when each pixel of the binary data is projected in the vertical direction. In the projection histogram, ruby or a modifier line appears as a range in which the frequency is higher than zero.
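The histogram generation of S187 can be illustrated with a short sketch. The binarization rule used here (a fixed threshold that turns dark pixels ON) is an assumption, since the details of S110 are not given in this passage; the function names are likewise hypothetical.

```python
from typing import List


def binarize(gray: List[List[int]], threshold: int = 128) -> List[List[int]]:
    """Assumed binarization: pixels darker than the threshold become ON (1)."""
    return [[1 if value < threshold else 0 for value in row] for row in gray]


def vertical_projection(binary: List[List[int]]) -> List[int]:
    """Count the ON pixels in each column, i.e. project every pixel vertically."""
    if not binary:
        return []
    return [sum(column) for column in zip(*binary)]


# A tiny 2 x 4 grayscale patch standing in for one band region.
patch = [
    [255, 0, 0, 255],
    [255, 255, 0, 255],
]
histogram = vertical_projection(binarize(patch))
print(histogram)
```

Each entry of the result is the frequency for one horizontal position, so a run of nonzero entries marks where ruby or a modifier line is present.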

In S188, the CPU 62 uses the projection histogram generated in S187 to determine whether the modifier contained in the modifier band region is a modifier line or ruby. Specifically, the CPU 62 determines that the modifier is a modifier line when the projection histogram shows a specific distribution, and that the modifier is ruby when it does not. The specific distribution is one that is characteristic of a modifier line: frequency values higher than zero continue along the horizontal direction over at least a predetermined length. The predetermined length may be any value that allows the presence of a modifier line to be identified; for example, it is a value larger than the horizontal length of one character. In a variation, the specific distribution may be one in which a constant frequency value higher than zero continues along the horizontal direction over at least the predetermined length.

When the CPU 62 determines that the modifier is a modifier line ("modifier line" in S188), it combines the target band region with the upper band region in S195. When the CPU 62 determines that the modifier is ruby ("ruby" in S188), it combines the target band region with the lower band region in S196. S195 and S196 are the same as S195 in FIG. 5 and S196 in FIG. 14, respectively. S198 is the same as S198 in FIG. 5.
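The decision of S188 and the resulting choice between S195 and S196 might look like the sketch below. The function names, the sample histograms, and the assumed character width of 6 pixels are illustrative only; the text requires merely that the predetermined length be large enough to identify a modifier line.

```python
from typing import List


def longest_nonzero_run(histogram: List[int]) -> int:
    """Length of the longest stretch of consecutive frequencies above zero."""
    best = current = 0
    for frequency in histogram:
        current = current + 1 if frequency > 0 else 0
        best = max(best, current)
    return best


def classify_modifier(histogram: List[int], min_run: int) -> str:
    """Return "line" when the specific distribution is present, else "ruby"."""
    return "line" if longest_nonzero_run(histogram) >= min_run else "ruby"


def neighbor_for(kind: str) -> str:
    """A modifier line joins the upper band (S195); ruby joins the lower band (S196)."""
    return "upper" if kind == "line" else "lower"


char_width = 6             # assumed horizontal length of one character, in pixels
min_run = char_width + 1   # predetermined length: larger than one character

ruby_hist = [0, 3, 4, 2, 0, 0, 1, 5, 1, 0, 0, 2, 2, 0]   # short, separated runs
line_hist = [0] + [1] * 12 + [0]                          # one long unbroken run

for hist in (ruby_hist, line_hist):
    kind = classify_modifier(hist, min_run)
    print(kind, neighbor_for(kind))
```

Ruby characters leave gaps between glyphs, so their nonzero runs stay shorter than one character, whereas a modifier line spans several characters without interruption.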

In this embodiment, the CPU 62 can distinguish between a modifier line and ruby by using the projection histogram corresponding to the target band region determined to be a modifier band region. For example, because the projection histogram corresponding to the band region LA32 does not show the specific distribution, the CPU 62 can determine that the modifier in the band region LA32 is ruby. Likewise, because the projection histogram corresponding to the band region LA34 shows the specific distribution, the CPU 62 can determine that the modifier in the band region LA34 is a modifier line. In this way, by using the projection histogram corresponding to the target band region determined to be a modifier band region, the CPU 62 can appropriately determine whether the target band region should be combined with the upper band region or with the lower band region. In this embodiment, the projection histogram generated in S187 is an example of the "first projection histogram".

Specific examples of the present invention have been described in detail above, but they are merely illustrative and do not limit the scope of the claims. The technology described in the claims includes various modifications and alterations of the specific examples illustrated above. Modifications of the above embodiments are listed below.

(Modification 1) The CPU 62 may generate rearranged image data representing the rearranged image RI without generating combined image data representing the combined image CI (FIG. 11). Specifically, the CPU 62 acquires, from the scanned image data SID, three pieces of partial image data representing the three band regions LA11, LA12', and LA14 (see FIG. 5). The CPU 62 then divides the second partial image data, which represents the band region LA12', to generate first divided image data representing the character string "F" and second divided image data representing the character string "G to J". The CPU 62 also divides the third partial image data, which represents the band region LA14, to generate third divided image data representing the character string "KL" and fourth divided image data representing the character string "M". Next, the CPU 62 combines the first partial image data, which represents the band region LA11, with the first divided image data to generate first intermediate image data representing the character string "A to F", and combines the second divided image data with the third divided image data to generate second intermediate image data representing the character string "G to L". The CPU 62 then arranges the first intermediate image data, the second intermediate image data, and the fourth divided image data in the rearrangement region RA so that the character strings "A to F", "G to L", and "M" are lined up along the vertical direction, thereby generating rearranged image data representing the rearranged image RI. In this modification, the rearranged image data is an example of the "target image data". In each of the above embodiments, the rearranged image data is enlarged to generate the processed image data PID; alternatively, the processed image data may be generated by overwriting the rearranged image data, as is, onto the scanned image data SID. In this modification, the processed image data can also be regarded as an example of the "target image data".

(Modification 2) The "modifier line" need not be a single line; it may be a double line, a broken line, a wavy line, or the like. Also, the fourth embodiment of FIG. 14, for example, assumes a situation in which hiragana ruby C21 is attached to Japanese kanji. Instead, pinyin ruby may be attached to Chinese characters in Chinese text, for example. That is, "kanji" is not limited to Japanese kanji and includes Chinese characters used in Chinese. Furthermore, the "modifier" is not limited to those exemplified in the above embodiments (that is, a modifier line and ruby attached to kanji); it may be, for example, a vector symbol placed above a character, or hiragana ruby (for example, そふとうぇあ) placed above Japanese katakana (for example, ソフトウェア). The "modifier" may also be, for example, ruby in a second language (for example, ソフトウェア in Japanese) placed above text in a first language (for example, "software" in English).

(Modification 3) Assume, for example, a situation in which the following are arranged in order from top to bottom in the vertical direction: a first character string, a modifier line that modifies the first character string, ruby, and a second character string containing the kanji to which the ruby is attached. In this situation, in the modifier analysis process of FIG. 15, the CPU 62 determines that the band region containing the modifier line is a modifier band region (YES in S184). Next, in S185, the CPU 62 calculates the distance d1 between the modifier band region containing the modifier line and the character-string band region containing the first character string. It also calculates the distance d2, not as the distance between the modifier band region containing the modifier line and the modifier band region containing the ruby, but as the distance between the modifier band region containing the modifier line and the character-string band region containing the second character string. In this case, the CPU 62 determines that d1 is less than d2 (YES in S186) and combines the character-string band region containing the first character string with the modifier band region containing the modifier line (S195). The CPU 62 also determines that the band region containing the ruby is a modifier band region (YES in S184). Next, in S185, the CPU 62 calculates the distance d1, not as the distance between the modifier band region containing the ruby and the modifier band region containing the modifier line, but as the distance between the modifier band region containing the ruby and the character-string band region containing the first character string. It further calculates the distance d2 between the modifier band region containing the ruby and the character-string band region containing the second character string. In this case, the CPU 62 may determine that d1 is greater than or equal to d2 (NO in S186) and combine the character-string band region containing the second character string with the modifier band region containing the ruby (S196). In this modification, the modifier band region containing the modifier line and the modifier band region containing the ruby are examples of the "sub band region". The character-string band region containing the first character string and the character-string band region containing the second character string are examples of the "first main band region" and the "second main band region", respectively.

(Modification 4) In S166 to S174 of FIG. 4, the middle position within the evaluation range for which the largest total lower-side length was calculated is determined as the reference position. Instead, the CPU 62 may determine the uppermost position or the lowermost position within that evaluation range as the reference position. In this modification, the uppermost or lowermost position within the evaluation range is an example of the "specific position along the second direction within the one evaluation range corresponding to the largest evaluation value". In another modification, the CPU 62 may determine, for example, a predetermined position in the vertical direction of the target band region (for example, the middle position, the upper end position, or the lower end position) as the reference position. That is, the "reference position" may be any position that serves as a reference for combining the M lines of character strings.
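The choice among the middle, uppermost, and lowermost positions can be sketched as below. The details of S166 to S174 are not reproduced in this passage, so the sliding evaluation range, its height of 3 pixels, the tie-breaking rule (the first maximum wins), and all names are assumptions made for illustration.

```python
from typing import List, Tuple


def reference_position(
    lower_sides: List[Tuple[int, int]],  # (y, length) of each unit region's lower side
    band_top: int,
    band_bottom: int,                    # one past the last row of the band
    range_height: int = 3,               # assumed height of one evaluation range
    anchor: str = "middle",              # "middle", "top", or "bottom"
) -> int:
    """Pick the evaluation range with the largest total lower-side length."""
    best_start, best_score = band_top, -1
    for start in range(band_top, band_bottom - range_height + 1):
        end = start + range_height       # exclusive end of this evaluation range
        score = sum(length for y, length in lower_sides if start <= y < end)
        if score > best_score:           # keep the first range that reaches the maximum
            best_start, best_score = start, score
    if anchor == "top":
        return best_start
    if anchor == "bottom":
        return best_start + range_height - 1
    return best_start + (range_height - 1) // 2


# Three glyphs ending near a baseline at y = 15 or 16, plus one descender at y = 19.
sides = [(15, 10), (15, 8), (16, 9), (19, 6)]
for anchor in ("middle", "top", "bottom"):
    print(anchor, reference_position(sides, 0, 20, anchor=anchor))
```

Whichever anchor is used, the same rule has to be applied to every main band region so that the M reference positions can be aligned when the lines are joined.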

(Modification 5) In the above embodiments, the image processing server 50 executes image processing on the scanned image data SID (that is, the processes S100 to S500 of FIG. 2) to generate the processed image data PID, and transmits the processed image data PID to the multi-function device 10 (S600). Instead, the multi-function device 10 may execute the image processing on the scanned image data SID to generate the processed image data PID (that is, the image processing server 50 need not exist). In this modification, the multi-function device 10 is an example of the "image processing apparatus".

(Modification 6) The target of the image processing executed by the image processing server 50 need not be the scanned image data SID; it may be data generated by document creation software, spreadsheet software, drawing software, or the like. That is, the "original image data" is not limited to data obtained by scanning a sheet to be scanned and may be another type of data.

(Modification 7) In the above embodiments, the scanned image SI includes character strings in which the sentence proceeds from left to right in the horizontal direction and from top to bottom in the vertical direction (that is, horizontally written character strings). Instead, the scanned image SI may include character strings in which the sentence proceeds from top to bottom in the vertical direction and from right to left in the horizontal direction (that is, vertically written character strings). In this case, in S162 and S164 of FIG. 4, the image processing server 50 normally cannot determine band regions based on the horizontal projection histogram. The image processing server 50 therefore generates a vertical projection histogram and determines the band regions from it. After that, the image processing server 50 may execute the same processing as in the above embodiments, using the vertical direction in place of the horizontal direction and the horizontal direction in place of the vertical direction. In this modification, the vertical direction, the upper side, and the lower side are examples of the "first direction", the "first side in the first direction", and the "second side in the first direction", respectively. The horizontal direction, the right side, and the left side are examples of the "second direction", the "first side in the second direction", and the "second side in the second direction", respectively.
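The reason the projection direction has to change for vertical writing can be seen in a small sketch. It assumes that a band region is a maximal run of nonzero frequencies in the projection profile, which is a simplification of S162 and S164; the function name and the sample image are hypothetical.

```python
from typing import List, Tuple


def find_bands(binary: List[List[int]], axis: str = "rows") -> List[Tuple[int, int]]:
    """Return (start, end) index pairs of runs where the projection profile is nonzero.

    axis="rows" sums each row (horizontal projection, for horizontal writing);
    axis="columns" sums each column (vertical projection, for vertical writing).
    """
    if axis == "rows":
        profile = [sum(row) for row in binary]
    else:
        profile = [sum(column) for column in zip(*binary)]
    bands, start = [], None
    for index, frequency in enumerate(profile):
        if frequency > 0 and start is None:
            start = index                      # a band begins
        elif frequency == 0 and start is not None:
            bands.append((start, index))       # a band ends before this index
            start = None
    if start is not None:
        bands.append((start, len(profile)))
    return bands


# Two vertically written lines: ON pixels fill column 0 and column 3.
vertical_text = [
    [1, 0, 0, 1],
    [1, 0, 0, 1],
    [1, 0, 0, 1],
]
print(find_bands(vertical_text, axis="rows"))     # the lines are not separated
print(find_bands(vertical_text, axis="columns"))  # each written line becomes a band
```

Summing along rows yields a single band covering the whole image, while summing along columns separates the two written lines, which is why the roles of the two directions are swapped for vertical writing.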

(Modification 8) In the above embodiments, the processes of FIGS. 2 to 16 are realized by the CPU 62 of the image processing server 50 executing the program 66 (that is, software). Instead, at least one of the processes of FIGS. 2 to 16 may be realized by hardware such as a logic circuit.

The technical elements described in this specification or the drawings exhibit technical utility either alone or in various combinations, and are not limited to the combinations recited in the claims as filed. The techniques illustrated in this specification or the drawings achieve multiple objectives simultaneously, and achieving any one of those objectives has technical utility in itself.

2: communication system, 4: Internet, 10: multi-function device, 50: image processing server, 52: network interface, 60: control unit, 62: CPU, 64: memory, 66: program, SI: scanned image, PI: processed image, TOB: text object, POB: photo object, TOA: text object region (text region), POA: photo object region, LA11 to LA14: band regions, h11 to h14: vertical lengths, TA: target region, RA: rearrangement region, CI: combined image, DI1, DI2, DI3: divided images, RI: rearranged image

Claims

An image processing apparatus comprising:
an acquisition unit that acquires original image data representing an original image, wherein the original image includes M lines of character strings (M being an integer of 1 or more) and a modifier that modifies a character to be modified among a plurality of characters constituting the M lines of character strings; each of the M lines of character strings is composed of two or more characters arranged along a first direction; when M is an integer of 2 or more, the M lines of character strings are arranged along a second direction orthogonal to the first direction; the modifier is located near the character to be modified, on a first side or a second side of that character in the second direction; and a plurality of pixels constituting a text region of the original image, the text region including the M lines of character strings and the modifier, include first-type pixels constituting the M lines of character strings or the modifier in the text region and second-type pixels constituting the background of the M lines of character strings and the modifier in the text region;
a determination unit that determines a plurality of band regions from the original image, the plurality of band regions including M main band regions containing the M lines of character strings and a sub band region containing the modifier;
a specifying unit that specifies the sub band region from among the plurality of band regions based on a plurality of lengths of the plurality of band regions along the second direction;
a reference position determination unit that determines, for each of the plurality of band regions, a reference position within the entire range of the band region along the second direction; and
a target image data generation unit that treats the modifier contained in the sub band region and the one line of character string contained in a neighboring main band region, which is located near the sub band region on the first side or the second side of the sub band region in the second direction, as one line of modified character string, and generates target image data representing a target image in which the plurality of characters constituting the M lines of character strings and the modifier are rearranged in a state different from that in the original image,
wherein the reference position determination unit includes:
a unit region determination unit that determines a plurality of unit regions from the text region, each of the plurality of unit regions being a region circumscribing a group of first-type pixels in the text region, each first-type pixel in the group being adjacent to at least one other first-type pixel in the group; and
an evaluation value calculation unit that calculates, for each of the plurality of band regions, a plurality of evaluation values corresponding to a plurality of evaluation ranges within the entire range of the band region along the second direction, each of the plurality of evaluation values being, when one or more specific sides are present in the corresponding evaluation range, the sum of the lengths of the one or more specific sides, a specific side being the side of a unit region on the second side in the second direction,
wherein the reference position determination unit determines, for each of the plurality of band regions, a specific position along the second direction within the one evaluation range corresponding to the largest of the plurality of evaluation values calculated for that band region as the reference position, and
wherein the target image data generation unit, without using the reference position determined for the sub band region, generates the target image data representing the target image including one line of target character string in which the M lines of character strings are linearly combined along the first direction such that the M reference positions determined for the M main band regions are at the same position in the second direction.

The image processing apparatus according to claim 1, wherein the modifier is a modifier line that is arranged on the second side, in the second direction, of the character to be modified and that extends along the first direction, and
the neighboring main band region is located near the sub band region on the first side of the sub band region in the second direction.

The image processing apparatus according to claim 1, wherein the character to be modified is a kanji (Chinese character),
the modifier is ruby arranged on the first side, in the second direction, of the character to be modified, and
the neighboring main band region is located near the sub band region on the second side of the sub band region in the second direction.

The image processing apparatus according to claim 1, wherein the target image data generation unit includes
a distance calculation unit that calculates a first distance and a second distance, the first distance being the distance along the second direction between the sub band region and a first main band region located near the sub band region on the first side in the second direction, and the second distance being the distance along the second direction between the sub band region and a second main band region located near the sub band region on the second side in the second direction, and
wherein the target image data generation unit:
determines the first main band region as the neighboring main band region when the first distance is less than the second distance; and
determines the second main band region as the neighboring main band region when the first distance is greater than or equal to the second distance.

The image processing apparatus according to claim 1, wherein the target image data generation unit includes
a first histogram generation unit that generates a first projection histogram corresponding to the sub band region, the first projection histogram showing the frequency distribution of the first-type pixels when each pixel constituting the sub band region in the text region is projected along the second direction, and
wherein the target image data generation unit:
determines, as the neighboring main band region, a first main band region located near the sub band region on the first side in the second direction when the first projection histogram shows a specific distribution in which frequency values higher than zero continue over at least a predetermined length along the first direction; and
determines, as the neighboring main band region, a second main band region located near the sub band region on the second side in the second direction when the first projection histogram does not show the specific distribution.

The image processing apparatus, wherein the specifying unit:
specifies a target band region among the plurality of band regions as the sub band region when the length of the target band region along the second direction is less than or equal to a set value that is set based on the plurality of lengths of the plurality of band regions along the second direction; and
specifies the target band region as a main band region when the length of the target band region along the second direction is greater than the set value.

The image processing apparatus described above, wherein the specifying unit calculates the average value of the plurality of lengths of the plurality of band regions along the second direction and sets a value less than or equal to the average value as the set value.

The image processing apparatus according to claim 6, wherein the specifying unit generates a graph showing the appearance frequency of each of the plurality of lengths of the plurality of band regions along the second direction on a plane defined by a first axis indicating length and a second axis indicating appearance frequency, and sets, as the set value, the length corresponding to the position on the first axis of the boundary between a first range and a second range,
the first range being a range that includes the highest appearance frequency and in which the appearance frequency is higher than zero, and the second range being a range in which the length indicated by the first axis is smaller than in the first range and in which the appearance frequency is zero.

The specifying unit generates, on a plane defined by a first axis indicating length and a second axis indicating appearance frequency, a graph showing the appearance frequency of each of the plurality of lengths along the second direction of the plurality of band regions, and sets, as the set value, the length corresponding to the position on the first axis of the boundary between a first range and a second range,
The first range is, among a plurality of ranges on the first axis in which the appearance frequency is higher than zero, the range in which the length indicated by the first axis is largest, and the second range is a range in which the length indicated by the first axis is smaller than in the first range and in which the appearance frequency is zero. The image processing apparatus according to claim 6.
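
The two graph-based rules for the set value (first range containing the highest appearance frequency, and first range containing the largest lengths) can be sketched together in Python. This is an illustration under stated assumptions: lengths are positive integers counted in unit-width bins, and the boundary is taken to be the largest length inside the zero-frequency second range, so that "length equal to or less than the set value" still excludes the first range. The function names are hypothetical.

```python
from collections import Counter
from typing import List, Sequence


def frequency_graph(lengths: Sequence[int]) -> List[int]:
    """Appearance frequency for every integer length from 0 to max(lengths)."""
    if not lengths or min(lengths) < 1:
        raise ValueError("lengths must be non-empty positive integers")
    counts = Counter(lengths)
    return [counts.get(length, 0) for length in range(max(lengths) + 1)]


def _lower_end_of_run(freq: Sequence[int], start: int) -> int:
    """Walk toward smaller lengths while the frequency stays above zero."""
    low = start
    while low > 0 and freq[low - 1] > 0:
        low -= 1
    return low


def set_value_mode_rule(lengths: Sequence[int]) -> int:
    """First range = the non-zero run that contains the highest frequency."""
    freq = frequency_graph(lengths)
    mode = max(range(len(freq)), key=freq.__getitem__)
    return _lower_end_of_run(freq, mode) - 1


def set_value_longest_rule(lengths: Sequence[int]) -> int:
    """First range = the non-zero run that contains the largest lengths."""
    freq = frequency_graph(lengths)
    return _lower_end_of_run(freq, len(freq) - 1) - 1
```

For the lengths `[41, 41, 41, 40, 12, 42]` both rules give 39, so the 12 px band falls at or below the set value. The rules diverge when short bands are the most frequent: for `[12, 12, 12, 40, 41]` the first rule gives 11 and the second gives 39.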

The determining unit includes:
A second histogram generation unit that generates a second projection histogram using the original image data, the second projection histogram being a histogram showing a frequency distribution of the first type pixels when the pixels constituting the text region are projected along the first direction,
The determining unit determines the plurality of band regions from the text region by using the second projection histogram. The image processing apparatus according to claim 1.
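
One way to derive band regions from the second projection histogram is sketched below in Python. The sketch assumes the text region is a binary raster in which 1 marks a first type pixel, that the first direction is horizontal (so projecting along it yields one frequency per row), and that bands are split wherever the frequency is zero. The splitting rule and the function names are assumptions for illustration; the text only states that the histogram is used.

```python
from typing import List, Sequence, Tuple


def projection_histogram(text_region: Sequence[Sequence[int]]) -> List[int]:
    """Count first type pixels (value 1) in each row of the text region."""
    return [sum(row) for row in text_region]


def find_bands(histogram: Sequence[int]) -> List[Tuple[int, int]]:
    """Return (top, bottom) row indices, inclusive, of each maximal run of
    rows whose frequency is higher than zero."""
    bands: List[Tuple[int, int]] = []
    start = None
    for y, freq in enumerate(histogram):
        if freq > 0 and start is None:
            start = y                      # a new band begins
        elif freq == 0 and start is not None:
            bands.append((start, y - 1))   # the band ended on the previous row
            start = None
    if start is not None:
        bands.append((start, len(histogram) - 1))
    return bands


region = [
    [0, 0, 0],
    [0, 1, 0],  # thin band, e.g. a modifier
    [0, 0, 0],
    [1, 1, 0],  # taller band, e.g. a line of characters
    [1, 0, 1],
    [0, 0, 0],
]
bands = find_bands(projection_histogram(region))
```

Here the histogram is `[0, 1, 0, 2, 2, 0]` and the bands are rows 1 to 1 and rows 3 to 4. The length of each band along the second direction is `bottom - top + 1`.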

A computer program for an image processing apparatus,
The computer program causes a computer mounted on the image processing apparatus to execute the following steps:
An acquisition step of acquiring original image data representing an original image, wherein the original image includes M lines of character strings (where M is an integer of 1 or more) and a modifier that modifies a character to be modified among the plurality of characters constituting the M lines of character strings; each of the M lines of character strings is composed of two or more characters arranged along a first direction; the M lines of character strings are arranged along a second direction orthogonal to the first direction when M is an integer of 2 or more; the modifier is present in the vicinity of the character to be modified on a first side or a second side of the character to be modified in the second direction; and the plurality of pixels constituting a text region that includes the M lines of character strings and the modifier in the original image include first type pixels, which constitute the M lines of character strings or the modifier included in the text region, and second type pixels, which constitute the background of the M lines of character strings and the modifier in the text region;
A determination step of determining a plurality of band regions from the original image, wherein the plurality of band regions include M main band regions containing the M lines of character strings and a sub-band region containing the modifier;
A specifying step of identifying the sub-band region from among the plurality of band regions based on a plurality of lengths along the second direction of the plurality of band regions;
A reference position determination step of determining, for each of the plurality of band regions, a reference position within the entire range of that band region along the second direction;
A target image data generation step of generating target image data representing a target image in which the plurality of characters constituting the M lines of character strings and the modifier are rearranged in a state different from the original image, by treating the modifier included in the sub-band region and the one line of character string included in a nearby main band region, which is present in the vicinity of the sub-band region on the first side or the second side of the sub-band region in the second direction, as a single-line modifier character string;
The reference position determination step includes:
A unit region determination step of determining a plurality of unit regions from the text region, wherein each of the plurality of unit regions is a region circumscribing a first type pixel group in the text region, and each of the first type pixels included in the first type pixel group is adjacent to at least one other first type pixel;
An evaluation value calculation step of calculating, for each of the plurality of band regions, a plurality of evaluation values corresponding to a plurality of evaluation ranges within the entire range of that band region along the second direction, wherein each of the plurality of evaluation values is, when one or more specific sides are present in the corresponding evaluation range, the sum of the lengths of the one or more specific sides, and each specific side is the side of a unit region on the second side in the second direction;
The reference position determination step determines, for each of the plurality of band regions, as the reference position, a specific position along the second direction within the evaluation range corresponding to the maximum evaluation value among the plurality of evaluation values calculated for that band region;
The target image data generation step generates, without using the reference position determined for the sub-band region, the target image data representing the target image that includes one line of target character string in which the M lines of character strings are linearly joined along the first direction such that the M reference positions determined for the M main band regions are at the same position in the second direction.
Computer program.
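
The evaluation-value rule for the reference position can be illustrated with a short Python sketch. Several details are assumptions made only for this example: text runs horizontally, the second side in the second direction is the bottom, unit regions are supplied as inclusive bounding boxes, evaluation ranges are non-overlapping windows of `range_height` rows, and the "specific position" is taken to be the first row of the winning window. `Box`, `reference_position`, and `vertical_shifts` are hypothetical names.

```python
from typing import List, NamedTuple, Sequence


class Box(NamedTuple):
    """Bounding box of one connected group of first type pixels (inclusive)."""
    left: int
    top: int
    right: int
    bottom: int


def reference_position(
    band_top: int, band_bottom: int, boxes: Sequence[Box], range_height: int = 2
) -> int:
    """Return the first row of the evaluation range whose bottom sides have
    the largest total length."""
    best_start, best_value = band_top, -1
    for start in range(band_top, band_bottom + 1, range_height):
        end = min(start + range_height - 1, band_bottom)
        # Evaluation value: summed widths of boxes whose bottom side lies in
        # the current evaluation range.
        value = sum(b.right - b.left + 1 for b in boxes if start <= b.bottom <= end)
        if value > best_value:
            best_start, best_value = start, value
    return best_start


def vertical_shifts(references: Sequence[int], target: int) -> List[int]:
    """Shift needed for each main band so its reference lands on `target`."""
    return [target - ref for ref in references]


# A band spanning rows 10..19: two glyphs rest on row 17, and a third glyph
# with a descender reaches row 19.
boxes = [Box(0, 10, 7, 17), Box(9, 12, 15, 17), Box(17, 12, 22, 19)]
baseline = reference_position(10, 19, boxes)
```

In this example the window covering rows 16 to 17 collects bottom sides of total length 15, against 6 for rows 18 to 19, so the reference position is row 16. Summing side lengths favors the row where most glyphs rest, which is why the descender does not pull the reference downward. `vertical_shifts([16, 46, 75], 16)` then gives the shifts that would place three main bands on a common reference when they are joined into one line.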