Monday, November 19, 2012

Segmentation of Bopomofo Symbols

A friend of mine is working on a project in which he needs to display Bopomofo (Chinese phonetic symbols used in Taiwan). He decided to display, in his program, each syllable as one unit that consists of vertically stacked phonetic symbols. To generate those well-positioned symbols, he typed all the syllables he needed in MS Word, captured screenshots as images, and asked me to segment them. He gave me 26 files in total. Each file has several lines of something like this:

Segmentation is a standard OCR procedure. I found a MATLAB script written by Ankit Saroch. His script detects an empty horizontal line to determine a line break, and then an empty vertical column to determine letter boundaries. While this works for English letters, segmenting Bopomofo turned out to be a bit more complicated because of the spacing between the stacked symbols:
The red line shows an empty horizontal line that is incorrectly marked as a line break. The tone marks on the right also cause some boundary problems (not shown).
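
To make the failure concrete, here is a minimal sketch on a synthetic binary image (1 = ink). The image, its sizes, and the variable names are made up for illustration and are not taken from the real screenshots. It computes the ink per row and shows that the first empty row falls inside a stacked syllable, not between two lines of text.

```octave
% Synthetic image: one "syllable" made of two stacked symbols with a
% 2-row gap between them, then a 6-row gap, then the next line of text.
img = zeros(20, 8);
img(1:4,   2:6) = 1;   % upper symbol of the syllable
img(7:10,  2:6) = 1;   % lower symbol (rows 5-6 are the inner gap)
img(17:20, 2:6) = 1;   % next line of text (rows 11-16 are the line gap)

row_ink   = sum(img, 2);        % amount of ink in each row
empty_row = (row_ink == 0)';    % true where a row contains no ink

% A rule that cuts at the first empty row would cut here:
first_empty = find(empty_row, 1);
fprintf('first empty row: %d\n', first_empty);
```

In this toy image the cut lands at row 5, which splits the syllable in two, while the real line gap only starts at row 11.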

The solution is simple: require a larger empty region, several rows or columns wide rather than just one, before declaring a boundary.
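
One way to express that idea is sketched below, again on a made-up synthetic image. It measures every run of empty rows and keeps only the runs that are at least max_space rows long. This is a stricter check than testing two individual rows, and the names run_start, run_end, and run_len are my own for this illustration.

```octave
max_space = 5;                  % minimum gap height that counts as a line break

% Same kind of toy image: a 2-row gap inside a syllable, a 6-row gap between lines.
img = zeros(20, 8);
img(1:4,   2:6) = 1;
img(7:10,  2:6) = 1;
img(17:20, 2:6) = 1;

empty_row = (sum(img, 2) == 0)';          % true where a row contains no ink

% Locate each run of consecutive empty rows via the padded difference.
d         = diff([0, empty_row, 0]);
run_start = find(d == 1);                 % first row of each empty run
run_end   = find(d == -1) - 1;            % last row of each empty run
run_len   = run_end - run_start + 1;

% Only runs that are tall enough are treated as line breaks.
is_break    = run_len >= max_space;
break_start = run_start(is_break);
break_end   = run_end(is_break);
```

Here the 2-row gap inside the syllable is ignored and only the 6-row gap (rows 11 to 16) is reported as a line break. The same logic applies to columns when separating syllables within a line.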

The following is the modified source code (from Ankit Saroch's original script) that I ran in Octave, which is compatible with MATLAB (note that I didn't need some of the functions Ankit originally used in MATLAB, which are not available in the standard Octave packages). The main program is segmentation.m, which reads the source image and calls lines_crop and letter_crop to crop each syllable (i.e. the vertically stacked symbols). max_space sets how large an empty gap, in pixels, must be to count as a line or syllable boundary. I also included a modified function clip.m that was useful for leaving an arbitrary width/height around the boundaries. Some debugging and experimentation comments are left in as well.
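
As a side note on leaving extra width/height around a cropped symbol, the sketch below shows one way a margin could be kept while staying inside the image. It is not the posted clip.m; the variable pad and the toy image are hypothetical.

```octave
pad = 2;                        % hypothetical margin, in pixels, on every side

img = zeros(10, 10);
img(4:6, 3:7) = 1;              % a small blob of ink

[r, c] = find(img);             % coordinates of all ink pixels
[h, w] = size(img);

% Bounding box grown by pad and clamped to the image borders.
r1 = max(min(r) - pad, 1);
r2 = min(max(r) + pad, h);
c1 = max(min(c) - pad, 1);
c2 = min(max(c) + pad, w);

cropped = img(r1:r2, c1:c2);
```

In this example the left edge is clamped at column 1, so the result keeps a 2-pixel margin on three sides and whatever room is available on the fourth.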

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% lines_crop.m
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function [fl re]=lines_crop(im_texto)
% Split off the first line of text.
% im_texto->input image; fl->first line; re->remaining lines
im_texto=clip(im_texto);

max_space = 5; % gap (in rows) required between two lines of text
num_filas=size(im_texto,1);
% Defaults: only one line (this also covers an empty input image).
fl=im_texto;
re=[ ];
for s=1:num_filas
    % Note: only rows s and s+max_space are tested, not the rows in between.
    if sum(im_texto(s,:))==0 && (s+max_space < num_filas) && (sum(im_texto(s+max_space,:)) == 0)
        nm=im_texto(1:s-1+max_space, :); % First line matrix
        rm=im_texto(s+max_space:end, :); % Remaining lines matrix
        fl=clip(nm);
        re=clip(rm);
        %*-*-*Uncomment lines below to see the result*-*-*-*-
        %subplot(2,1,1);imshow(fl);
        %subplot(2,1,2);imshow(re);
        break
    end
end

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% letter_crop.m
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function [fl re space]=letter_crop(im_texto)
% Split off the first syllable (stack of symbols) of a line.
% im_texto->input line image; fl->first syllable; re->remainder of the line;
% space->number of empty columns trimmed from the remainder

im_texto=clip(im_texto);
num_filas=size(im_texto,2); % number of columns, despite the name

max_space = 5; % gap (in columns) required between two syllables
% Defaults: only one syllable (this also covers an empty input image).
fl=im_texto;
re=[ ];
space = 0;
for s=1:num_filas
    % Note: only columns s and s+max_space are tested, not the columns in between.
    if sum(im_texto(:,s)) == 0 && (s+max_space < num_filas) && (sum(im_texto(:,s+max_space)) == 0)
        nm=im_texto(:,1:s-1+max_space); % First syllable matrix
        %figure,imshow(nm);
        %title('first letter in the function letter_in_a_line');
        rm=im_texto(:,s+max_space:end); % Remaining line matrix
        %figure,imshow(rm);
        %title('remaining letters in the function letter_in_a_line');
        fl=clip(nm);
        re=clip(rm);
        space = size(rm,2)-size(re,2);
        %*-*-*Uncomment lines below to see the result*-*-*-*-
        %subplot(2,1,1);imshow(fl);
        %subplot(2,1,2);imshow(re);
        break
    end
end


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% clip.m
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
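function img_out=clip(img_in)
% Crop img_in to the bounding box of its nonzero pixels.
    [x y]=size(img_in);
    [f c]=find(img_in); % row and column indices of nonzero pixels

    % Bottom edge plus an optional offset (0 here), clamped to the image height.
    max_r = max(f) + 0;
    if max_r > x
        max_r = x;
    endif

    temp_img = img_in(min(f):max_r,min(c):max(c));
    % Second pass: crop temp_img to its tight bounding box.
    [f1 c1]=find(temp_img);
    img_out=temp_img(min(f1):max(f1),min(c1):max(c1));
endfunction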

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% segmentation.m
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function result=segmentation(filename)
 warning off;
 serial=0;
 imagen=imread(strcat(filename,'.jpg'));
 % Show image
 imagen1 = imagen;
 % Convert to gray scale
 if size(imagen,3)==3 %RGB image
  imagen=rgb2gray(imagen);
 end
 % Convert to BW
 threshold = graythresh(imagen);

 imagen =~im2bw(imagen,threshold);
 imagen2 = imagen;
 re=imagen;
 while 1
  %Fcn 'lines_crop' separates lines in text
  [fl re]=lines_crop(re); %fl= first line, re= remaining image
  imgn=fl;
  n=0;
  %Uncomment line below to see lines one by one
  if ~isempty(fl)
   %figure,imshow(fl);pause(.5)
  else
   break; % no more lines
  endif
  %-----------------------------------------------------------------
  
  spacevector = [];      % to compute the total spaces between
          % adjacent letters
  rc = fl;              
    
  while 1
   %Fcn 'letter_crop' separates letters in a line
     [fc rc space]=letter_crop(rc);  %fc =  first letter in the line
             %rc =  remaining cropped line
             %space = space between the letter
             %   cropped and the next letter
     %uncomment the imshow line below to see letters one by one
   if ~isempty(fc)
    % Invert back to dark-on-light and save as <filename>-<serial>.png
    imwrite(~fc,strcat(filename,"-",num2str(serial),".png"));
    %figure,imshow(fc);pause(0.5);
    serial=serial+1;
   else
    break; % no more letters in this line
   endif
  end
  
 end
endfunction

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% licence.txt
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Copyright (c) 2012, Tsan-Kuang Lee
% Copyright (c) 2011, Ankit Saroch
% Copyright (c) 2006, Diego Orlando Barragan Guerrero
% All rights reserved.
% 
% Redistribution and use in source and binary forms, with or without 
% modification, are permitted provided that the following conditions are 
% met:
% 
%     * Redistributions of source code must retain the above copyright 
%       notice, this list of conditions and the following disclaimer.
%     * Redistributions in binary form must reproduce the above copyright 
%       notice, this list of conditions and the following disclaimer in 
%       the documentation and/or other materials provided with the distribution
%     * Neither the name of the EQBYTE Instruments Cía. Ltda. nor the names 
%       of its contributors may be used to endorse or promote products derived 
%       from this software without specific prior written permission.
%       
% THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" 
% AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 
% IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE 
% ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE 
% LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR 
% CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF 
% SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS 
% INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN 
% CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) 
% ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE 
% POSSIBILITY OF SUCH DAMAGE.



