JP2006304063A

JP2006304063A - Image processing apparatus, image processing method and program

Info

Publication number: JP2006304063A
Application number: JP2005124985A
Authority: JP
Inventors: Reiji Misawa; 玲司三沢
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2005-04-22
Filing date: 2005-04-22
Publication date: 2006-11-02
Anticipated expiration: 2025-04-22
Also published as: JP4411244B2

Abstract

PROBLEM TO BE SOLVED: To accurately discriminate a character region from a non-character region in an image processing apparatus, an image processing method, and a program.

SOLUTION: The method includes the steps of generating a binary document image from a multi-level document image, extracting a determination target region from the generated binary image, applying a first and a second thinning process to the image of the determination target region, calculating a color dispersion value of the multi-level document image at the positions corresponding to each thinning result, and determining whether the determination target region consists of characters or a photograph on the basis of the calculated first and second color dispersion values.

COPYRIGHT: (C)2007,JPO&INPIT

Description

The present invention relates to an image processing apparatus, an image processing method, and a program that can suitably perform region determination on a document image.

In recent years, with the spread of color printers, color scanners, and the like, the number of color documents has increased, and so have the opportunities to scan them and store them as electronic files or send them to third parties over the Internet. However, full-color data places a heavy load on storage devices and communication lines, so the amount of data handled must be reduced by compression or a similar method.

Conventional methods for compressing a color image include converting it into a binary image with pseudo-gradation by error diffusion or the like and compressing that image, compressing it in JPEG format, and converting it to an 8-bit palette color and applying ZIP or LZW compression. There are also compression methods that obtain high quality for ordinary character regions by combining region determination with binary compression by MMR, lossless compression by ZIP, and lossy compression by JPEG (see, for example, Patent Document 1 and Patent Document 2).

As a conventional technique for document image processing, there is also the optical character recognition (OCR) apparatus, which optically inputs a document, recognizes its characters, and outputs text codes (see, for example, Patent Document 3).

In OCR, character lines are cut out (extracted) by density projection (histogram), and character blocks are then cut out in units of one character. To cut out character blocks, a density projection is taken along the character line direction and the character lines are separated based on changes in the projection value; then, for each character line, a density projection is taken in the direction perpendicular to the line to extract the individual character blocks. If necessary, the final character blocks, each a character image of one character, are cut out using estimates of the standard character size and character pitch together with the density projection values perpendicular to the line. Each extracted character block is normalized in height and width and then subjected to predetermined feature extraction. For each character block whose feature data has been extracted, the similarity to standard patterns obtained in advance is calculated, and the character with the highest similarity is taken as the recognition result. The set of standard patterns is called a recognition dictionary.

Patent Document 1: JP 2002-077633 A
Patent Document 2: JP 2004-128880 A
Patent Document 3: JP 2003-346083 A
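The projection-based segmentation used in OCR, as described above, can be illustrated with a minimal sketch. It assumes a horizontally written page stored as a list of rows of 0/1 values (1 = black pixel); the function names and the zero threshold are illustrative choices, not taken from the cited documents.

```python
def projection(binary, axis):
    """Count black pixels (1s) per row (axis=0) or per column (axis=1)."""
    if axis == 0:
        return [sum(row) for row in binary]
    return [sum(col) for col in zip(*binary)]


def runs(profile, threshold=0):
    """Return (start, end) pairs, end exclusive, where profile > threshold."""
    spans, start = [], None
    for i, value in enumerate(profile):
        if value > threshold and start is None:
            start = i                      # a run of ink begins
        elif value <= threshold and start is not None:
            spans.append((start, i))       # the run ended at a gap
            start = None
    if start is not None:
        spans.append((start, len(profile)))
    return spans


def character_blocks(binary):
    """Split the page into lines, then split each line into character blocks."""
    blocks = []
    for top, bottom in runs(projection(binary, 0)):      # separate the lines
        line = binary[top:bottom]
        for left, right in runs(projection(line, 1)):    # separate the characters
            blocks.append((top, bottom, left, right))
    return blocks


page = [
    [1, 1, 0, 1, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1],
]
print(character_blocks(page))
```

A real OCR front end would additionally merge or split blocks using the estimated character size and pitch, which this sketch omits.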

According to the methods described in Patent Document 1 and Patent Document 2, high quality is obtained for ordinary character regions by combining region determination with binary compression by MMR, lossless compression by ZIP, and lossy compression by JPEG. However, the region determination may erroneously judge a region that is not a character (such as a photograph region; hereinafter, a non-character) to be a character, in which case the image quality is instead greatly degraded.

In OCR processing, if a region cut out as a character block is a non-character, character recognition is performed on that non-character. This is undesirable because it lowers the overall processing speed and meaningless text codes may be included in the output recognition result.

The present invention has been made in view of these circumstances, and its object is to provide an image processing apparatus, an image processing method, and a computer program that can satisfactorily determine whether an extracted region has the character or the non-character attribute.

To solve the above problems, an image processing apparatus according to the present invention comprises: binarization means for generating a binary document image from a multi-level document image; region extraction means for extracting a determination target region from the generated binary image; first thinning means for thinning the image of the determination target region by a first thinning process; second thinning means for thinning the image of the determination target region by a second thinning process; first color dispersion value calculating means for calculating a first color dispersion value of the multi-level document image at the positions corresponding to the thinning result of the first thinning means; second color dispersion value calculating means for calculating a second color dispersion value of the multi-level document image at the positions corresponding to the thinning result of the second thinning means; and region determination means for determining whether the determination target region is a character or a photograph based on the calculated first and second color dispersion values.

To solve the above problems, an image processing method according to the present invention comprises: a binarization step of generating a binary document image from a multi-level document image; a region extraction step of extracting a determination target region from the generated binary image; a first thinning step of thinning the image of the determination target region by a first thinning process; a second thinning step of thinning the image of the determination target region by a second thinning process; a first color dispersion value calculating step of calculating a first color dispersion value of the multi-level document image at the positions corresponding to the thinning result of the first thinning step; a second color dispersion value calculating step of calculating a second color dispersion value of the multi-level document image at the positions corresponding to the thinning result of the second thinning step; and a region determination step of determining whether the determination target region is a character or a photograph based on the calculated first and second color dispersion values.
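The flow of the claimed method can be sketched roughly as follows. This is only an illustration under several assumptions that the text above does not specify: "color dispersion" is taken to mean the sum of per-channel statistical variances, each thinning process is replaced by a simple 4-neighbour erosion (one pass for the first, two passes for the second), and the decision rule and its `threshold` value are hypothetical.

```python
def erode(mask):
    """One pass of 4-neighbour erosion; a crude stand-in for a thinning step."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            if (mask[y][x] and mask[y - 1][x] and mask[y + 1][x]
                    and mask[y][x - 1] and mask[y][x + 1]):
                out[y][x] = 1
    return out


def color_variance(image, mask):
    """Sum of per-channel variances of the RGB pixels selected by mask."""
    pixels = [image[y][x]
              for y, row in enumerate(mask)
              for x, on in enumerate(row) if on]
    if not pixels:
        return 0.0                      # nothing left after thinning
    n = len(pixels)
    total = 0.0
    for c in range(3):
        mean = sum(p[c] for p in pixels) / n
        total += sum((p[c] - mean) ** 2 for p in pixels) / n
    return total


def looks_like_text(image, mask, threshold=100.0):
    """Hypothetical rule: text strokes keep a nearly uniform color at both
    thinning levels, whereas photograph regions vary in color."""
    first = erode(mask)                 # stand-in for the first thinning
    second = erode(first)               # stand-in for the second thinning
    v1 = color_variance(image, first)
    v2 = color_variance(image, second)
    return v1 <= threshold and v2 <= threshold


mask = [[1] * 7 for _ in range(7)]
flat = [[(10, 10, 10)] * 7 for _ in range(7)]
gradient = [[((x * 40) % 256, (y * 40) % 256, 0) for x in range(7)] for y in range(7)]
print(looks_like_text(flat, mask), looks_like_text(gradient, mask))
```

Because erosion can erase thin strokes entirely, a real implementation would need a proper thinning algorithm and a rule for empty results; the sketch only shows where the two dispersion values come from.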

According to the present invention, character and non-character regions can be determined accurately. Applying this region determination result to a compression technique therefore yields good image quality and improved compression efficiency. Applying it to OCR improves the processing speed, suppresses the output of meaningless text codes, and improves the recognition rate.

(Example 1)
The following embodiment describes an example in which the region determination technique of the present invention is applied to a color image compression technique that can be installed in, for example, a color copying machine. The functions of a color copying machine include, for example, a color copy function, a color print function, and a color scanner function; of these, the region determination technique described in this embodiment is applicable to the color copy function and the color scanner function. Specifically, it can be applied to the compression technique used to compress color image data obtained by reading a color original. The color scanner function includes, for example, a data transmission function that compresses color image data obtained by reading a color original and transmits it to an external destination, and a save function that compresses the same color image data and stores it in a storage unit inside the copying machine.

Hereinafter, the present invention will be described in detail according to preferred embodiments with reference to the drawings.

FIG. 1 is a schematic diagram showing a system configuration according to an embodiment of the present invention. It shows an environment in which a multifunction peripheral (MFP) 101 having a network communication function and a host computer (hereinafter, PC) 102 are connected by a transmission medium such as a network 103.

Dotted lines 104 and 105 indicate the flow of processing and control and are described in order below. Reference numeral 104 denotes the process in which the user has the scanner of the MFP 101 read a paper document. At that time, the user specifies in advance, through the user interface of the MFP 101 described later (203 in FIG. 2), the transmission destination (for example, the PC 102), various transmission settings (for example, resolution and compression rate), and the data format (for example, JPEG, TIFF, PDF, PDF high compression, or PDF with OCR results). Because this embodiment describes an example in which the region determination method of the present invention is used in a color image compression technique, the case where PDF high compression is specified as the data format is described. The technical details of PDF high compression are given later. Reference numeral 105 denotes the process of generating data with the software or hardware functions of the MFP 101 described later, based on the specified transmission settings and data format, and transmitting it to the specified destination. Because the image is transmitted to the PC 102 in a file format such as PDF, it can be viewed with a general-purpose viewer on the PC 102.

Next, the detailed hardware configuration of the MFP 101 in FIG. 1 will be described with reference to FIG. 2.

The MFP 101 includes a scanner unit 201, which is an image input device; a printer unit 202, which is an image output device; a control unit (Controller Unit) 204 composed of a CPU, memory, and the like; and an operation unit 203, which is the user interface. The control unit 204 is a controller that is connected to the scanner unit 201, the printer unit 202, and the operation unit 203 on one side and to a LAN 219 and a public line (WAN) 220, which is an ordinary telephone network, on the other, and that inputs and outputs image information and device information. The CPU 205 is a controller that controls the entire system. The RAM 206 is the system work memory on which the CPU 205 operates and also serves as an image memory for temporarily storing image data. The ROM 210 is a boot ROM that stores the system boot program. The HDD 211 is a hard disk drive that stores system control software and image data. The operation unit I/F 207 is the interface to the operation unit (UI) 203 and outputs image data to be displayed on the operation unit 203. It also conveys information entered by the user of the image processing apparatus at the operation unit 203 to the CPU 205. The network (Network) 208 connects the image processing apparatus to the LAN 219 and inputs and outputs information in packet form. The modem (MODEM) 209 connects the image processing apparatus to the public line 220 and demodulates and modulates information for input and output. The above devices are arranged on the system bus 221.

The image bus interface (Image Bus I/F) 212 is a bus bridge that connects the system bus 221 to the image bus 222, which transfers image data at high speed, and converts the data structure. The image bus 222 is, for example, a PCI bus or IEEE 1394.

The following devices are arranged on the image bus 222. The raster image processor (RIP) 213 analyzes PDL code and expands it into a bitmap image. The device I/F unit 214 connects the scanner unit 201 and the printer unit 202, which are image input/output devices, to the control unit 204 via the signal lines 223 and 224, respectively, and performs synchronous/asynchronous conversion of image data. The scanner image processing unit 215 corrects, processes, and edits input image data. The printer image processing unit 216 applies correction, resolution conversion, and the like appropriate to the printer unit 202 to the print output image data. The image rotation unit 217 rotates input image data and outputs it. The image compression unit 218 performs JPEG compression and decompression, or device-specific compression and decompression, on multi-level image data, and JBIG, MMR, or MH compression and decompression on binary image data. This completes the detailed hardware configuration of the MFP 101 in FIG. 1.

Next, the software configuration implemented in the control unit 204 in FIG. 2 will be described with reference to FIG. 3. Reference numeral 301 denotes a user interface (hereinafter, UI), a module that mediates between the device and user operations when the operator performs various operations and settings on the MFP using the operation unit 203. In accordance with the operator's operations, this module transfers input information to the various modules described later to request processing, set data, and so on.

Reference numeral 302 denotes an address book (Address-Book), that is, a database module that manages data destinations, communication destinations, and the like. Operations from the operation unit 203 are detected by the UI 301, and data in the address book 302 is added, deleted, or retrieved accordingly. In response to the operator's operations, the address book is used to supply data delivery and communication destination information to the modules described later.

Reference numeral 303 denotes a Web server module (Web-Server module), which is used to report management information of the MFP in response to a request from a Web client (for example, the PC 102). This management information is read via the integrated transmission unit (Universal-Send module) 304, the remote copy scan module (Remote-Copy-Scan module) 309, the remote copy print module (Remote-Copy-Print module) 310, and the control API (Control-API) 318, all described later, and is reported to the Web client via the HTTP module 312, the TCP/IP communication module 316, and the network driver (Network-Driver) 317, also described later. The Web server module 303 creates the information to be passed to the Web client as data in a so-called Web page (home page) format such as HTML. Java (registered trademark), CGI programs, and the like are used as necessary.

Reference numeral 304 denotes the integrated transmission unit (Universal-Send module), the module that governs data distribution; it distributes the data specified by the operator via the UI 301 to the specified communication (output) destination. When the operator instructs it to generate the distribution data using the scanner function of the MFP, it operates the scanner 201 of the MFP via the control API 318 described later to generate the data.

Reference numeral 305 denotes the module in the integrated transmission unit 304 that is executed when a printer is specified as the output destination. Reference numeral 306 denotes the module that is executed when an e-mail address is specified as the communication destination. Reference numeral 307 denotes the module that is executed when a database is specified as the output destination. Reference numeral 308 denotes the module that is executed when an MFP similar to this MFP is specified as the output destination.

Reference numeral 309 denotes the remote copy scan (Remote-Copy-Scan) module, which uses the scanner function of the MFP 101 to read image information with the scanner 201 and outputs it on the printer of another MFP connected via a network or the like, thereby performing processing equivalent to the copy function that the MFP 101 realizes on its own. Reference numeral 310 denotes the remote copy print (Remote-Copy-Print) module, which takes image information read by the scanner of another MFP connected via a network or the like as its input and outputs it using the printer function of the MFP 101, likewise performing processing equivalent to the copy function that the MFP 101 realizes on its own. The box module 311 stores scanned images or PDL print images on the HDD and provides management functions such as printing stored images with the printer function, sending them with the integrated transmission (Universal-Send) function, deleting documents stored on the HDD, grouping (storing in individual boxes), moving between boxes, and copying between boxes. The communication functions of the box module 311 are provided by the HTTP module 312 and the TCP/IP module 316.

Reference numeral 312 denotes the HTTP module, which is used when the MFP communicates over HTTP and provides communication functions to the Web server module 303 and the Web pull print module 311 through the TCP/IP communication module 316 described later. Reference numeral 313 denotes the lpr module, which provides communication functions to the printer module 305 in the integrated transmission unit 304 through the TCP/IP communication module 316. Reference numeral 314 denotes the SMTP module, which provides communication functions to the e-mail module 306 in the integrated transmission unit 304 through the TCP/IP communication module 316. Reference numeral 315 denotes the SLM, that is, the Salutation-Manager module, which provides communication functions to the database module 317 and the DP module 318 in the integrated transmission unit 304, the remote copy scan module 309, and the remote copy print module 310 through the TCP/IP communication module 316.

Reference numeral 316 denotes the TCP/IP communication module, which provides network communication functions to the modules described above using the network driver 317 described next. Reference numeral 317 denotes the network driver, which controls the part physically connected to the network.

Reference numeral 318 denotes the control API, which provides upstream modules such as the integrated transmission unit 304 with an interface to downstream modules such as the job manager (Job-Manager) 319 described later. It reduces the dependencies between upstream and downstream modules and makes each of them easier to reuse. Reference numeral 319 denotes the job manager, which interprets the processing requested by the modules described above via the control API 318 and issues instructions to the modules described later (320, 324, and 326). The job manager 319 centrally manages the various jobs executed in the MFP, including the control of FAX jobs.

Reference numeral 320 denotes the codec manager (CODEC-Manager), which manages and controls the various kinds of data compression and decompression within the processing directed by the job manager 319. Reference numeral 321 denotes the FBE encoder module (FBE-Encoder), which compresses, in the FBE format, data read by the scan processing executed by the job manager 319 and the scan manager (Scan-Manager) 324 described later. Reference numeral 322 denotes the JPEG codec module (JPEG-CODEC), which performs JPEG compression of read data and JPEG decompression of print data in the scan processing executed by the job manager 319 and the scan manager 324 and in the print processing executed by the print manager (Print-Manager) 326. Reference numeral 323 denotes the MMR codec (MMR-CODEC), which performs MMR compression of data read from the scanner and MMR decompression of print data to be output to the printer in the scan processing executed by the job manager 319 and the scan manager 324 and in the print processing executed by the print manager 326.

Reference numeral 324 denotes the scan manager (Scan-Manager), which manages and controls the scan processing directed by the job manager 319. Reference numeral 325 denotes the SCSI driver, which handles communication between the scan manager 324 and the scanner unit 201 to which the MFP is internally connected. Reference numeral 326 denotes the print manager (Print-Manager), which manages and controls the print processing directed by the job manager 319. Reference numeral 327 denotes the engine interface (Engine-I/F), which provides the interface between the print manager 326 and the printer unit 202. Reference numeral 328 denotes the parallel port driver, which provides the interface used when the Web pull print 311 outputs data to an output device (not shown) via the parallel port.

Next, the Address-Book 302 is described in detail. The Address-Book 302 is stored in a non-volatile storage device (non-volatile memory, hard disk, or the like) in the MFP 101 and describes the characteristics of other devices connected to the network. For example, it contains items such as the following.
・The official name and alias names of the device
・The network address of the device
・The network protocols the device can process
・The document formats the device can process
・The compression types the device can process
・The image resolutions the device can process
・For printer devices, the paper sizes that can be fed and paper feed tray information
・For server (computer) devices, the names of folders that can store documents
Each application described below can determine the characteristics of a delivery destination from the information recorded in the Address-Book 302.

The MFP 101 can transmit data by referring to the Address-Book 302. For example, the remote copy application determines from the Address-Book 302 the resolutions that the device specified as the delivery destination can process, compresses the binary image read by the scanner accordingly using the known MMR compression, converts it into the known TIFF (Tagged Image File Format), and sends it through the SLM 315 to a printer device on the network. The SLM 315 is not described in detail here, but it is a kind of network protocol, known as Salutation-Manager, that includes device control information and the like.
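As an illustration of how an application might consult such a record, the sketch below models one address book entry and a resolution lookup. The field names, the example values, and the selection policy (the highest supported resolution that does not exceed the scan resolution) are assumptions made for this example; the text above specifies only which kinds of information are stored.

```python
from dataclasses import dataclass, field


@dataclass
class AddressBookEntry:
    """One device record; field names are hypothetical."""
    name: str
    network_address: str
    aliases: list = field(default_factory=list)
    protocols: list = field(default_factory=list)
    document_formats: list = field(default_factory=list)
    compression_types: list = field(default_factory=list)
    resolutions_dpi: list = field(default_factory=list)
    paper_sizes: list = field(default_factory=list)   # printer devices only
    folders: list = field(default_factory=list)       # server devices only


def pick_resolution(entry, scanned_dpi):
    """Assumed policy: the highest supported resolution not above the scan
    resolution, else the lowest supported one, else the scan resolution."""
    if not entry.resolutions_dpi:
        return scanned_dpi
    candidates = [r for r in entry.resolutions_dpi if r <= scanned_dpi]
    return max(candidates) if candidates else min(entry.resolutions_dpi)


printer = AddressBookEntry(
    name="printer-1",                  # example values, not from the source
    network_address="192.0.2.10",
    protocols=["SLM"],
    document_formats=["TIFF"],
    compression_types=["MMR"],
    resolutions_dpi=[200, 300, 600],
)
print(pick_resolution(printer, 400))
```

The remaining steps in the paragraph (MMR compression, TIFF wrapping, and sending through the SLM) depend on device-specific codecs and protocols and are left out.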

Next, the hardware configuration of the host computer 102 in FIG. 1 will be described with reference to FIG. 4. The host computer 102 has the configuration and functions of a general personal computer and consists of a monitor 401 and a keyboard/mouse 402 as peripheral devices, a central processing unit (CPU) 403 that controls the entire host computer 102, a hard disk 405 that stores applications and data, a memory 406, and so on. It is connected to a transmission medium such as the network 103 via a network interface 406.

Next, the PDF high compression mentioned above is described with reference to FIGS. 5 and 6.

PDF high compression here is a color image compression technique. It performs region determination and adaptively switches between binary compression by MMR and lossy compression by JPEG according to the attribute of each region, which raises the compression ratio while preserving high quality in character regions.

The input image (501), a multi-valued image, is binarized by the binarization unit (502) to generate a binary image (503). The region determination unit (504) takes the binary image (503) as input, obtains pixel blocks by, for example, tracing the contours of pixels of a predetermined value (for example, black pixels), and forms regions by grouping the pixel blocks according to their size and position. It then identifies character regions from the size and arrangement of the pixel blocks within each formed region and generates character region information, which indicates the position and size of each character region. Once the region determination unit (504) has identified the character regions, the remaining parts are determined to be photograph regions, that is, natural (gradation) images such as photographs, illustrations, and backgrounds. For each region determined to be a character region by the region determination unit (504), the character cutout unit (505) cuts out each character (unit character region) in the region as a character cut rectangle and generates character cut rectangle information, which indicates the position and size of each character cut rectangle. The character region information and the character cut rectangle information are managed together as the character region information (506). In addition, with the binary image (503) as input, a partial binary image (507), which is a binary image of each character region, is generated for each region determined to be a character region by the region determination unit (504).

Meanwhile, the input image (501) is reduced (or lowered in resolution) by the reduction unit (512) to generate a reduced multi-valued image (513). The representative color extraction unit (510) takes the partial binary image (507) as input, calculates the representative color of each character cut rectangle while referring to the character region information (506) and the reduced multi-valued image (513), and manages the result as character color information (511) (see Patent Document 2 for details of this processing). The character region filling unit (514) takes the reduced multi-valued image (513) as input and, referring to the character region information (506) and the partial binary image (507), fills each character region or character cut rectangle of the reduced multi-valued image (513) with its surrounding color (see Patent Document 1 for details of this processing).

After the above processing, each partial binary image (507) is compressed by the MMR compression unit (508) into compressed code 1 (509). The filled multi-valued image produced by the character region filling unit (514) is compressed by the JPEG compression unit (515) into compressed code 2 (516).

In this way, compressed data (517) is generated by combining the data obtained from the components: the character region information (506), compressed code 1 (509), the character color information (511), and compressed code 2 (516). Further losslessly compressing this compressed data (517) in a format such as PDF yields the PDF high-compression data.

FIG. 6 shows a schematic configuration for decompressing the compressed data (517) generated as described above. The MMR decompression unit (601) takes compressed code 1 (509) as input and performs MMR decompression to generate a partial binary image (602). The JPEG decompression unit (605) takes compressed code 2 (516) as input and performs JPEG decompression, and the enlargement unit (606) then enlarges the result to generate a multi-valued image (607). The compositing unit (603), referring to the character region information (506), assigns the character color information (511) to the black pixels of the partial binary image (602) and composites the colored partial binary image on top of the multi-valued image (607) for display. White pixels of the partial binary image (602) are assigned a transparent color, so the multi-valued image (607) shows through them.
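The compositing step can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: images are nested Python lists, the function name `composite` and the parameters `top` and `left` are hypothetical, and the partial binary image is assumed to already match the resolution of the multi-valued image.

```python
def composite(background, mask, text_color, top, left):
    """Overlay a colored partial binary image on a multi-valued image.

    background: rows of (R, G, B) tuples, standing in for the multi-valued image (607)
    mask:       rows of 0/1 values, standing in for the partial binary image (602)
    text_color: (R, G, B) taken from the character color information (511)
    top, left:  position of the mask inside the background (hypothetical
                parameters; the patent keeps positions in the character
                region information (506))
    """
    out = [list(row) for row in background]  # copy so the input is untouched
    for dy, mask_row in enumerate(mask):
        for dx, pixel in enumerate(mask_row):
            if pixel == 1:
                # Black pixel: paint it with the character color.
                out[top + dy][left + dx] = text_color
            # White pixel: treated as transparent, so the background stays.
    return out
```

Because only black pixels are written, no explicit transparency value is needed in this sketch.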

In this way, the image decompression apparatus decompresses the compressed data generated by the image compression apparatus and restores the image.

FIG. 7 is a schematic view of the images used or generated in FIGS. 5 and 6.

Reference numeral 701 denotes the input image (501). Reference numeral 702 denotes the binary image (503).

Reference numeral 703 denotes the result of the region determination unit (504) dividing the image into character regions and photograph regions. Here, 704 and 706 are assumed to have been determined to be character regions, and 705 a photograph region.

Reference numerals 707 and 708 denote the partial binary images (507) of the regions determined to be character regions by the region determination unit (504).

Reference numeral 709 denotes a schematic view of the character cut rectangles cut out by the character cutout unit (505). Reference numeral 710 denotes the character cut rectangle of the character region 704, and 711 and 712 denote the character cut rectangles of the character region 706. As 711 and 712 show, characters and photographs can be mixed among the character cut rectangles of a character region. For example, when pixel blocks are grouped by proximity and similarity of size, as in Patent Document 1, a photograph area close to character size can end up inside a character region. If all of these rectangles are handled as characters, a rectangle that should be handled as a photograph, such as 712, is processed as a binary image and information is lost. Reference numeral 713 denotes the compressed data (517), or the PDF high-compression data, generated when every character cut rectangle in a character region is handled as a character. As shown at 714, a photograph area that originally had gradation and color is treated as a character region and binarized, so information is lost.

To solve these problems, the present invention additionally provides a region determination unit 2 (801), as shown in FIG. 8, which performs region determination on each character cut rectangle. The other components are the same as in FIG. 5.

Next, the region determination unit 2 (801), which is the key point of the present invention, is described using the flowchart of FIG. 9. The flowchart of FIG. 9 is part of the processing of FIG. 8, so FIG. 8 is referred to as appropriate. The region determination unit 2 (801) corresponds to steps 907 to 912, which are enclosed by the broken line 917 in FIG. 9.

First, in step 901, the binarization unit (502) binarizes the input image (501).

Next, in step 902, the region determination unit (504) performs region determination on the binary image (503). In this region determination, for example, pixel blocks are obtained by tracing contours in the binary image, and nearby pixel blocks are grouped so that characters and character lines that had been split apart are joined. Whether a region formed by this grouping is a character region containing one or more characters is then determined from the size, positional relationship, and other properties of the pixel blocks it contains.

Next, in step 903, the region counter n is initialized. In step 904, if the region of interest was determined to be a character region, the process proceeds to step 905; if it was determined to be a non-character region, the process proceeds to step 912.

In step 905, the character cutout unit (505) cuts out characters. For example, character lines can be cut out by taking a histogram in the horizontal direction, and character rectangles can then be cut out by taking a histogram of each character line in the vertical direction.
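The histogram-based cutout of step 905 can be sketched as follows. This is an illustrative sketch only: the binary image is assumed to be a list of rows of 0/1 values with 1 meaning black, any gap of at least one empty row or column is assumed to separate lines or characters, and the names `runs_of_nonzero` and `cut_characters` are hypothetical.

```python
def runs_of_nonzero(profile):
    """Return half-open (start, end) index pairs of consecutive non-zero entries."""
    spans, start = [], None
    for i, value in enumerate(list(profile) + [0]):  # sentinel closes a trailing run
        if value and start is None:
            start = i
        elif not value and start is not None:
            spans.append((start, i))
            start = None
    return spans


def cut_characters(img):
    """Return character rectangles as (top, bottom, left, right), bounds half-open."""
    width = len(img[0]) if img else 0
    boxes = []
    # Horizontal histogram: black pixels per row, used to find character lines.
    row_profile = [sum(row) for row in img]
    for top, bottom in runs_of_nonzero(row_profile):
        # Vertical histogram of this line: black pixels per column.
        col_profile = [sum(img[y][x] for y in range(top, bottom)) for x in range(width)]
        for left, right in runs_of_nonzero(col_profile):
            boxes.append((top, bottom, left, right))
    return boxes
```

A real implementation would likely tolerate small gaps inside a character; this sketch splits on every empty column.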

In step 906, the character cut rectangle counter m is initialized.

Next, in the region determination unit 2 (801), thinning (1) is first applied in step 907 to the character cut rectangle cut out in step 905.

Thinning (1) is now described in detail.

Thinning (1) thins the character cut rectangle area of the input binary image. The method first detects horizontally connected black pixels and deletes one pixel at each of the left and right ends (replacing it with a white pixel). It then detects vertically connected black pixels and deletes one pixel at each of the top and bottom ends. For example, the binary image 1401 shown in FIG. 14 becomes the image shown at 1402 after the horizontal pass, and the image shown at 1403 after the subsequent vertical pass. The thinned image is needed only temporarily, for calculating the color dispersion value described later, so it is saved in a temporary storage area.
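Thinning (1) can be sketched as follows, assuming the binary image is a list of rows of 0/1 values with 1 meaning black. The function names are hypothetical, and the sketch assumes that a run of one or two pixels disappears entirely, which is consistent with the statement later in the description that 2-pixel-wide black pixels are deleted.

```python
def erase_run_ends(line):
    """Set the first and last pixel of every run of 1s in a 0/1 sequence to 0."""
    out = list(line)
    start = None
    for i, value in enumerate(list(line) + [0]):  # sentinel closes a trailing run
        if value == 1 and start is None:
            start = i
        elif value != 1 and start is not None:
            out[start] = 0   # one end of the run
            out[i - 1] = 0   # other end (same pixel when the run has length 1)
            start = None
    return out


def thin1(img):
    """Horizontal pass, then vertical pass on the result of the horizontal pass."""
    rows = [erase_run_ends(row) for row in img]
    cols = [erase_run_ends(col) for col in zip(*rows)]  # columns of the row-thinned image
    return [list(row) for row in zip(*cols)]            # transpose back to rows
```

On a solid 5x5 block this leaves only the 3x3 core, while a 2-pixel-wide stroke is removed completely.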

When a character cut rectangle of a binary image such as 1101 in FIG. 11 is input, the image after thinning (1) is as shown at 1104. Reference numeral 1101 denotes the character cut rectangle of the input binary image, in which the character "P" is enclosed by a square frame. The frame is 2 pixels wide, as shown at 1102, and the stroke of the character "P" has the pixel width shown at 1103. Reference numeral 1104 denotes the image after thinning: the surrounding frame is deleted, as shown at 1105, and the character "P" takes the pixel width shown at 1106.

Next, in step 908, thinning (2) is applied to the character cut rectangle cut out in step 905.

Thinning (2) is now described in detail.

Like thinning (1), thinning (2) thins the character cut rectangle area of the input binary image. It differs from thinning (1) in that it detects connected black pixels and switches the number of pixels to delete according to the number of connected pixels. The relationship between the number of connected pixels and the number of deleted pixels is as follows.
0 ≦ number of connected pixels ≦ 2: no pixels deleted
3 ≦ number of connected pixels ≦ 6: 1 pixel deleted at each end
7 ≦ number of connected pixels: 2 pixels deleted at each end
For example, when thinning (2) is applied to the binary image 1401 shown in FIG. 14, detecting horizontally connected black pixels and deleting pixels at the left and right ends according to the above relationship gives the image shown at 1404. Detecting vertically connected black pixels and deleting pixels at the top and bottom ends then gives the image shown at 1405. The thinned image generated in this way is saved in a temporary storage area. The image obtained by applying thinning (2) to a character cut rectangle such as 1101 in FIG. 11 is shown at 1107 in FIG. 11. With thinning (2), the square frame is not deleted, as shown at 1108, and the character "P" takes the pixel width shown at 1109.
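The run-length rule of thinning (2) can be sketched as follows, again assuming a list of rows of 0/1 values with 1 meaning black. The function names are hypothetical; the thresholds 2 and 6 come from the table in the text.

```python
def trim_count(run_length):
    """Pixels to delete at each end of a run, per the table for thinning (2)."""
    if run_length <= 2:
        return 0
    if run_length <= 6:
        return 1
    return 2


def trim_runs(line):
    """Trim both ends of every run of 1s in a 0/1 sequence by trim_count pixels."""
    out = list(line)
    start = None
    for i, value in enumerate(list(line) + [0]):  # sentinel closes a trailing run
        if value == 1 and start is None:
            start = i
        elif value != 1 and start is not None:
            for p in range(trim_count(i - start)):
                out[start + p] = 0   # trim from the leading end
                out[i - 1 - p] = 0   # trim from the trailing end
            start = None
    return out


def thin2(img):
    """Horizontal pass, then vertical pass on the result of the horizontal pass."""
    rows = [trim_runs(row) for row in img]
    cols = [trim_runs(col) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]
```

Unlike a fixed one-pixel trim, this rule leaves runs of one or two pixels untouched, so a 2-pixel-wide stroke keeps its width and only loses pixels along its length.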

As described above, thinning (1) and thinning (2) are performed in steps 907 and 908. Executing steps 907 and 908 in parallel shortens the processing time, but they may also be executed one after the other.

Next, in step 909, a color dispersion value is calculated for each of the images obtained by thinning (1) and thinning (2), giving ColorStd_1 and ColorStd_2 respectively.

The color dispersion is now described in detail. In this embodiment, the color dispersion is used as a criterion for determining whether the colors within a character cut rectangle are a single color (for example, black or red) or multiple colors (for example, a mixture of black and red). When the color dispersion value is small, the rectangle is judged likely to be a single color; when it is large, the rectangle can be judged to contain multiple colors. This embodiment uses the magnitude of the color dispersion value as the criterion for deciding between character and photograph, based on the empirical rule that characters are usually a single color whereas photographs (natural images and illustrations) usually contain multiple colors.

Next, the method of calculating the color dispersion value is described in detail. The color information refers to the colors (for example, RGB values of 8 bits each) of the reduced multi-valued image (513) or of the input image (501) before binarization. Preferably, values obtained by converting the RGB values into luminance and color-difference information (for example, YCbCr values of 8 bits each) are referred to. Here, as an example, the color dispersion value of the Cb component of the YCbCr values is calculated. The conversion from RGB to YCbCr is well known, so its description is omitted.
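For reference, one common form of the RGB to YCbCr conversion is sketched below. The patent does not say which variant it uses; this sketch assumes the full-range equations used by JFIF (based on ITU-R BT.601), and the function name `rgb_to_ycbcr` is hypothetical.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert 8-bit RGB to 8-bit YCbCr using the JFIF full-range equations.

    Assumption: the JFIF variant is used here only as an example; other
    YCbCr definitions (for example, limited-range) use different constants.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    # Round to the nearest integer and clamp into the 8-bit range 0..255.
    return tuple(max(0, min(255, round(v))) for v in (y, cb, cr))
```

Neutral grays map to Cb = Cr = 128, which is why a single-color black character yields Cb values that cluster tightly.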

First, a distribution of Cb values against frequency of occurrence is generated from the colors (Cb values) of the reduced multi-valued image (513) at the positions corresponding to the black pixels of the thinned image 1104 obtained by thinning (1). For example, FIG. 12 shows the black pixels of FIG. 11 pixel by pixel; the Cb value of pixel 1201, then that of pixel 1202, and so on are referred to in order to build the distribution. The variance is calculated from the distribution generated in this way.

The dispersion is the generally known variance and is obtained by the following formula.
-variance: $\mathrm{variance} = \frac{1}{n}\sum_{i=1}^{n}\left(Cb(i) - m\right)^{2}$
-n: number of data items (number of black pixels in the character cut rectangle)
-Cb(i): Cb value of the reduced multi-valued image at the position corresponding to the i-th black pixel in the character cut rectangle
-m (mean): $m = \frac{1}{n}\sum_{i=1}^{n} Cb(i)$
The color dispersion value is calculated as described above. The color dispersion values for the images obtained by thinning (1) and thinning (2) are denoted ColorStd_1 and ColorStd_2, respectively. Although the color dispersion value of the Cb component is calculated here as an example, the color dispersion value may instead be calculated from the Cr, R, G, or B values. After the color dispersion values have been calculated, the temporary storage area mentioned above is initialized.
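The variance calculation can be sketched as follows. The sketch assumes that the thinned image and the Cb plane are lists of rows with the same dimensions, so that pixel positions correspond directly; mapping between a full-resolution binary image and the reduced multi-valued image is omitted. The name `color_dispersion` and the choice to return 0.0 when no black pixel remains are assumptions of this sketch.

```python
def color_dispersion(thinned, cb_plane):
    """Population variance of Cb over the black pixels of a thinned binary image.

    thinned:  rows of 0/1 values (1 = black pixel that survived thinning)
    cb_plane: rows of Cb values at the same positions
    """
    values = [
        cb_plane[y][x]
        for y, row in enumerate(thinned)
        for x, pixel in enumerate(row)
        if pixel == 1
    ]
    n = len(values)
    if n == 0:
        # Assumption: with no surviving black pixel, report zero dispersion.
        return 0.0
    m = sum(values) / n                               # mean m
    return sum((cb - m) ** 2 for cb in values) / n    # sum((Cb(i) - m)^2) / n
```

Calling it once with the output of thinning (1) and once with the output of thinning (2) would give the two values the text names ColorStd_1 and ColorStd_2.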

Next, the reason for calculating the color dispersion from a thinned image is explained. As described above, the color dispersion refers to the reduced multi-valued image (513) or to the input image (501) before binarization. Thinning is performed to reduce the influence of the quality of the color multi-valued image being referred to: when that image is degraded by compression, color misregistration, or the like, the degradation distorts the color dispersion value that the characters in the character cut rectangle would otherwise have.

Next, the reason for calculating the color dispersion from two kinds of thinned images is explained. The advantages and disadvantages of calculating the color dispersion from thinning (1) and from thinning (2) are as follows.
(A) Calculating the color dispersion from thinning (1)
-Advantage: The original color dispersion value of the character or photograph in the character cut rectangle, that is, a highly accurate color dispersion value, is obtained (apart from the exception described under the disadvantage).
-Disadvantage: Black pixels that are 2 pixels wide are deleted, so the color dispersion value is inaccurate for images with many 2-pixel-wide parts.

The advantage of calculating the color dispersion from thinning (1) is explained with reference to FIG. 14. As described above, 1403 is the image obtained by applying thinning (1) to 1401, and 1405 is the image obtained by applying thinning (2) to 1401. With thinning (1), only the core (interior) of the image remains in 1403, so the result is not easily affected by the quality of the color multi-valued image. With thinning (2), parts other than the core also remain in 1405, so the result is more easily affected.
(B) Calculating the color dispersion from thinning (2)
-Advantage: For photographs with many 2-pixel-wide parts, a more accurate color dispersion value is obtained than with (A).
-Disadvantage: In general, the color dispersion value is less accurate than with (A).

The advantage of calculating the color dispersion from thinning (2) is explained with reference to FIG. 13. Reference numeral 1301 denotes a color multi-valued image in which a black (gradated) mobile phone is drawn inside a red circular frame. Reference numeral 1302 denotes the binary image obtained by binarizing 1301; the circular frame consists of black pixels 2 pixels wide. Reference numeral 1303 denotes the image obtained by applying thinning (1) to 1302, and 1304 the image obtained by applying thinning (2) to 1302. With thinning (1), the circular frame is deleted in 1303, so the color dispersion value becomes small. With thinning (2), the circular frame remains in 1304, so the color dispersion value becomes large. Thinning (1) alone therefore may fail to identify the rectangle as a photograph, which is why thinning (2) is also needed.

Next, in step 910, the color dispersion values (ColorStd_1, ColorStd_2) are compared with preset thresholds (th1, th2). If both ColorStd_1 and ColorStd_2 are smaller than their thresholds, the rectangle is determined to be TEXT in step 911; if either is larger than its threshold, it is determined to be IMAGE in step 912. Next, in step 913, the character cut rectangle counter m is compared with the number of character cut rectangles M. When steps 907 to 912 have been completed for all character cut rectangles, the process proceeds to step 914; if unprocessed character cut rectangles remain, the counter m is incremented in step 915 and the next character cut rectangle is processed. In step 914, the region counter n is compared with the number of regions N. When all regions have been processed, the process ends; if unprocessed regions remain, the counter n is incremented in step 916 and the next region is processed.
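The decision of steps 910 to 912 and the two loops around it can be sketched as follows. The function names and the data layout are hypothetical, the threshold values are left to the caller because the patent gives none, and the sketch assumes that a value exactly equal to its threshold counts as IMAGE, a case the text does not specify.

```python
def classify_rectangle(color_std_1, color_std_2, th1, th2):
    """Step 910: TEXT only when both dispersion values are below their thresholds."""
    if color_std_1 < th1 and color_std_2 < th2:
        return "TEXT"    # step 911
    return "IMAGE"       # step 912 (assumption: equality also lands here)


def classify_regions(regions, th1, th2):
    """Classify every character cut rectangle of every character region.

    regions: list of regions; each region is a list of
             (ColorStd_1, ColorStd_2) pairs, one per character cut rectangle.
    """
    results = []
    for rectangles in regions:                 # loop over regions (counter n)
        results.append([
            classify_rectangle(s1, s2, th1, th2)
            for s1, s2 in rectangles           # loop over rectangles (counter m)
        ])
    return results
```

Requiring both values to be small means that a rectangle is treated as a photograph as soon as either thinning reveals a wide color spread.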

In this way, the region determination unit 2 (801) uses the color dispersion values to determine whether each character cut rectangle in a character region is TEXT or IMAGE.

For example, when the input image (501) is 701 in FIG. 7, 1001 in FIG. 10 schematically shows the result determined by the region determination unit 2 (801). Reference numeral 802 in FIG. 8 indicates that, based on this result, the partial binary image is generated from the character cut rectangles that lie in a character region and for which the region determination unit 2 (801) returned TEXT. For example, 1002 schematically shows the partial binary image of the text areas generated when the input image (501) is 701 in FIG. 7. With the processing of the region determination unit 2, TEXT 711 and IMAGE 712 are distinguished, so reproducing the generated compressed data (517) or PDF high-compression data gives the result shown at 1003.

As described above, the region determination unit 2 (801) can accurately determine whether a character cut rectangle is a character or a photograph, based on the color dispersion values obtained from the rectangle with two kinds of thinning. Applying this determination result to compression makes it possible to obtain compressed data (517), or PDF high-compression data, of good image quality.

(Embodiment 2)
Embodiment 1 described the case where the color dispersion values of a character cut rectangle are calculated with two kinds of thinning and the rectangle is then determined to be a character or a photograph. In Embodiment 2, the color dispersion value obtained with the first thinning method is additionally used to decide whether the color dispersion value with the second thinning method needs to be calculated at all.

The region determination method of Embodiment 2 is described below using the flowchart of FIG. 15.

FIG. 15 shows the region determination unit 2 (801), corresponding to 917 in FIG. 9 used in Embodiment 1. First, in step 1501, thinning (1) described in Embodiment 1 is performed. Next, in step 1502, the color dispersion value (ColorStd_1) is calculated. Next, in step 1503, the color dispersion value (ColorStd_1) is compared with a preset first threshold th1. If it is smaller than the threshold, the process proceeds to step 1504; if it is larger than the threshold, the rectangle is determined to be IMAGE in step 1508. In step 1504, thinning (2) described in Embodiment 1 is performed. Next, in step 1505, the color dispersion value (ColorStd_2) is calculated. Next, in step 1506, the color dispersion value (ColorStd_2) is compared with a preset second threshold th2. If it is smaller than the threshold, the rectangle is determined to be TEXT in step 1507; if it is larger than the threshold, the rectangle is determined to be IMAGE in step 1508.
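The early exit of FIG. 15 can be sketched as follows. The two dispersion calculations are passed in as zero-argument callables so that the second one runs only when needed; the function name, this calling convention, and the treatment of a value equal to its threshold as IMAGE are assumptions of this sketch.

```python
def classify_two_stage(compute_std_1, compute_std_2, th1, th2):
    """Two-stage TEXT/IMAGE decision that skips thinning (2) when possible.

    compute_std_1: callable performing thinning (1) and returning ColorStd_1
    compute_std_2: callable performing thinning (2) and returning ColorStd_2
    """
    if compute_std_1() >= th1:      # steps 1501 to 1503
        return "IMAGE"              # step 1508, thinning (2) is never run
    if compute_std_2() >= th2:      # steps 1504 to 1506
        return "IMAGE"              # step 1508
    return "TEXT"                   # step 1507
```

The result matches the single-stage rule of Embodiment 1, since a rectangle is TEXT only when both values fall below their thresholds.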

As described above, deciding from the color dispersion value of the first thinning method whether to calculate the color dispersion value of the second thinning method allows faster region determination while keeping the same accuracy as Embodiment 1.

(Embodiment 3)
Embodiments 1 and 2 described examples in which this region determination method is used in an image compression technique. Embodiment 3 describes an example in which it is used together with optical character recognition (OCR).

As described above, OCR processing cuts out (extracts) character lines by taking density projections (histograms) of the document image, and then cuts out (extracts) character blocks of one character each. Feature data is extracted from each character block, its similarity to standard patterns is calculated, and the character with the highest similarity is output as the recognition result. In other words, the processing up to character block cutout consists of binarization, region determination, and character cutout, as described for FIG. 9 of Embodiment 1. Also as described above, when a character cut rectangle is a non-character, running character recognition on it is undesirable: it lowers the overall processing speed and may output meaningless text codes.

In Embodiment 3, region determination is performed on each character-cut rectangle before OCR processing to decide whether it is a character or a non-character, and OCR processing is performed only when the rectangle is determined to be a character. This resolves the problems above. The processing is shown in the flowchart of FIG. 16. In FIG. 16, the processing of 901 to 916 is the same as in FIG. 9 described in Embodiment 1, so its description is omitted. In step 910, the color dispersion values (ColorStd_1, ColorStd_2) are compared with the preset thresholds (th1, th2). If ColorStd_1 and ColorStd_2 are both smaller than their thresholds, the rectangle is determined to be TEXT in step 911, so OCR processing is performed in step 1601 and the character recognition result is output. If either value is larger than its threshold, the rectangle is determined to be IMAGE in step 910, so OCR processing is not performed.
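The gating of OCR by the region determination result can be sketched as below. The `Rect` data class, the function `ocr_text_regions`, and the `run_ocr` callback are hypothetical names introduced for illustration; the color dispersion values are assumed to have been computed already, and a value exactly equal to its threshold is treated as IMAGE, which the text does not specify.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Rect:
    """One character-cut rectangle with its two color dispersion values."""
    name: str
    color_std_1: float
    color_std_2: float


def ocr_text_regions(rects: List[Rect],
                     th1: float,
                     th2: float,
                     run_ocr: Callable[[Rect], str]) -> Dict[str, str]:
    """Run OCR only on rectangles judged TEXT; IMAGE rectangles are skipped."""
    results = {}
    for rect in rects:
        # TEXT only when both dispersion values are below their thresholds.
        is_text = rect.color_std_1 < th1 and rect.color_std_2 < th2
        if is_text:
            results[rect.name] = run_ocr(rect)
    return results
```

Rectangles judged IMAGE never reach `run_ocr`, so no recognition time is spent on them and no text code is emitted for them.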

As described above, because unnecessary OCR processing is not performed, processing speed improves and the output of meaningless text codes is suppressed.

(Embodiment 4)
The present invention may be applied to a system composed of a plurality of devices (for example, a host computer, an interface device, a reader, and a printer) or to an apparatus consisting of a single device (for example, a copier or a facsimile machine).

Needless to say, the object of the present invention is also achieved by supplying a system or apparatus with a recording medium on which the program code of software that realizes the functions of the above embodiments is recorded, and having a computer (or CPU or MPU) of that system or apparatus read and execute the program code stored on the recording medium. In this case, the program code read from the recording medium itself realizes the novel functions of the present invention, and the storage medium storing that program code constitutes the present invention. As a storage medium for supplying the program code, for example, a floppy (registered trademark) disk, hard disk, optical disk, magneto-optical disk, CD-ROM, CD-R, magnetic tape, nonvolatile memory card, or ROM can be used.

Needless to say, the invention covers not only the case where the functions of the above embodiments are realized by the computer executing the program code it has read, but also the case where an OS or the like running on the computer performs part or all of the actual processing based on the instructions of the program code, and that processing realizes the functions of the above embodiments.

Furthermore, needless to say, the invention also covers the case where the program code read from the recording medium is written to a memory provided in a function expansion board inserted in the computer or in a function expansion unit connected to the computer, after which a CPU or the like provided in that function expansion board or function expansion unit performs part or all of the actual processing based on the instructions of the program code, and that processing realizes the functions of the above embodiments.

As described above, the present invention makes it possible to improve region determination accuracy. Because characters and non-characters can be discriminated appropriately, good image quality is obtained and compression efficiency is improved. In addition, when OCR is used, processing speed is improved, the output of meaningless text codes is suppressed, and the recognition rate is improved.

FIG. 1: Schematic diagram of the system in Embodiment 1
FIG. 2: Hardware configuration of the MFP in Embodiment 1
FIG. 3: Software configuration of the MFP in Embodiment 1
FIG. 4: Hardware configuration of the PC in Embodiment 1
FIG. 5: Block diagram 1 of the image compression apparatus
FIG. 6: Block diagram of the image decompression apparatus
FIG. 7: Sample of input image to output image
FIG. 8: Block diagram 2 of the image compression apparatus according to the present invention
FIG. 9: Flowchart of region determination in Embodiment 1
FIG. 10: Sample 2 of input image to output image in Embodiment 1
FIG. 11: Explanatory diagram 1 of thinning and color dispersion
FIG. 12: Explanatory diagram 2 of thinning and color dispersion
FIG. 13: Explanatory diagram 3 of thinning and color dispersion
FIG. 14: Explanatory diagram of thinning
FIG. 15: Flowchart of region determination in Embodiment 2
FIG. 16: Flowchart of region determination in Embodiment 3

Claims

An image processing apparatus comprising:
binarization means for generating a binary document image from a multi-valued document image;
region extraction means for extracting a determination target region from the generated binary image;
first thinning means for thinning the image of the determination target region by a first thinning process;
second thinning means for thinning the image of the determination target region by a second thinning process;
first color dispersion value calculation means for calculating a first color dispersion value of the multi-valued document image at positions corresponding to the thinning result of the first thinning means;
second color dispersion value calculation means for calculating a second color dispersion value of the multi-valued document image at positions corresponding to the thinning result of the second thinning means; and
region determination means for determining whether the determination target region is a character or a photograph based on the calculated first color dispersion value and second color dispersion value.

The image processing apparatus according to claim 1, further comprising pre-determination means for determining a character region candidate in the document image based on pixel blocks contained in the document image,
wherein the region extraction means extracts, as the determination target region, a rectangular region of a character contained in the character region candidate determined by the pre-determination means.

The image processing apparatus according to claim 1, wherein the first thinning means obtains a thinned image by detecting black pixels connected in the horizontal direction and deleting a predetermined number of pixels at both the left and right ends, and further detecting black pixels connected in the vertical direction and deleting a predetermined number of pixels at both the upper and lower ends.

The image processing apparatus according to claim 1, wherein the second thinning means obtains a thinned image by detecting black pixels connected in the horizontal direction and deleting pixels at both the left and right ends in accordance with the number of connected black pixels, and further detecting black pixels connected in the vertical direction and deleting pixels at both the upper and lower ends in accordance with the number of connected black pixels.

The image processing apparatus according to claim 4, wherein the second thinning means deletes no pixels when the number of connected black pixels is 2 or less, deletes one pixel at each end when the number of connected black pixels is 3 to 6, and deletes two pixels at each end when the number of connected black pixels is 7 or more.

The image processing apparatus according to claim 1, wherein the region determination means determines the determination target region to be a character when the calculated first color dispersion value is smaller than a predetermined first threshold and the calculated second color dispersion value is smaller than a predetermined second threshold.

The image processing apparatus according to claim 1, further comprising compression means for obtaining compressed data of the document image by performing a first compression process on a region determined to be a character by the region determination means and performing a second compression process on a region determined to be a photograph.

The image processing apparatus according to claim 7, wherein the first compression process is a compression process suitable for a binary image, and the second compression process is a compression process suitable for a multi-valued image.

The image processing apparatus according to claim 7, further comprising representative color calculation means for calculating a representative color from a region determined to be a character by the region determination means,
wherein the compressed data includes a first compressed code obtained by the first compression process, a second compressed code obtained by the second compression process, and representative color information obtained by the representative color calculation means.

The image processing apparatus according to claim 8, wherein the document image subjected to the second compression process is a document image generated by filling the portions of the document image determined to be character regions with the surrounding color.

The image processing apparatus according to claim 1, further comprising character recognition means for performing character recognition processing on a region determined to be a character by the region determination means.

An image processing method comprising:
a binarization step of generating a binary document image from a multi-valued document image;
a region extraction step of extracting a determination target region from the generated binary image;
a first thinning step of thinning the image of the determination target region by a first thinning process;
a second thinning step of thinning the image of the determination target region by a second thinning process;
a first color dispersion value calculation step of calculating a first color dispersion value of the multi-valued document image at positions corresponding to the thinning result of the first thinning step;
a second color dispersion value calculation step of calculating a second color dispersion value of the multi-valued document image at positions corresponding to the thinning result of the second thinning step; and
a region determination step of determining whether the determination target region is a character or a photograph based on the calculated first color dispersion value and second color dispersion value.

A computer program for controlling an image processing apparatus that processes a document image, the computer program comprising program code for causing a computer to execute:
a binarization step of generating a binary document image from a multi-valued document image;
a region extraction step of extracting a determination target region from the generated binary image;
a first thinning step of thinning the image of the determination target region by a first thinning process;
a second thinning step of thinning the image of the determination target region by a second thinning process;
a first color dispersion value calculation step of calculating a first color dispersion value of the multi-valued document image at positions corresponding to the thinning result of the first thinning step;
a second color dispersion value calculation step of calculating a second color dispersion value of the multi-valued document image at positions corresponding to the thinning result of the second thinning step; and
a region determination step of determining whether the determination target region is a character or a photograph based on the calculated first color dispersion value and second color dispersion value.

A computer-readable storage medium storing the computer program according to claim 13.
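
As an illustration only, the run-length-dependent deletion recited for the second thinning process (no deletion for runs of 2 or fewer black pixels, one pixel at each end for runs of 3 to 6, two pixels at each end for runs of 7 or more) can be sketched as follows. The function names are hypothetical, the image is assumed to be a list of rows with 1 for black pixels, and the sketch assumes that the vertical pass operates on the output of the horizontal pass, which the claims do not state explicitly.

```python
from typing import List


def pixels_to_trim(run_length: int) -> int:
    """Number of pixels deleted at each end of a run of connected black pixels."""
    if run_length <= 2:
        return 0
    if run_length <= 6:
        return 1
    return 2


def thin_runs_1d(line: List[int]) -> List[int]:
    """Trim both ends of every run of 1s in a single row or column."""
    out = list(line)
    n = len(line)
    i = 0
    while i < n:
        if line[i] == 1:
            j = i
            while j < n and line[j] == 1:
                j += 1                      # j is one past the end of the run
            for t in range(pixels_to_trim(j - i)):
                out[i + t] = 0              # delete from the leading end
                out[j - 1 - t] = 0          # delete from the trailing end
            i = j
        else:
            i += 1
    return out


def thin_second(binary: List[List[int]]) -> List[List[int]]:
    """Horizontal pass, then vertical pass on the horizontally thinned image."""
    rows = [thin_runs_1d(row) for row in binary]
    cols = [thin_runs_1d(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]
```

Because thin strokes (runs of 2 or fewer) are left intact while thick strokes lose more pixels, the surviving positions tend to lie in stroke interiors, away from edge pixels.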