Segmentation is a standard step in OCR. I found a MATLAB script written by Ankit Saroch. His script looks for an empty horizontal row of pixels to find a line break, and then for an empty vertical column to find letter boundaries. While this works for English letters, segmenting Bopomofo turned out to be a bit more complicated, because of the spacing between the symbols:
The red line shows an empty horizontal row that is incorrectly treated as a line break. The tone marks on the right also cause boundary problems (not shown).
The solution is simple: require a bigger empty region, several rows or columns rather than a single one, before declaring a boundary.
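To make the idea concrete, here is a minimal Octave sketch on a toy image. It is not the script from this post: it measures the length of every run of empty rows, which is a stricter test than the one in the listing, and min_gap is a hypothetical name for the threshold. The toy image assumes 1 = ink and 0 = background.

```octave
% Toy binary image: 1 = ink, 0 = background (each row is a scan line).
bw = zeros(12, 4);
bw(1:2, :)   = 1;   % first symbol of a syllable
bw(4:5, :)   = 1;   % second symbol, separated by a single empty row
bw(11:12, :) = 1;   % next text line, after 5 empty rows

min_gap = 3;                          % hypothetical threshold (assumption)
empty = (sum(bw, 2) == 0)';           % row vector: 1 where a row has no ink
d = diff([0 empty 0]);                % +1 where an empty run starts, -1 just after it ends
run_start = find(d == 1);             % first row of each empty run
run_len   = find(d == -1) - run_start;% length of each empty run
breaks = run_start(run_len >= min_gap); % rows where a real line break begins
disp(breaks)                          % shows 6: the 1-row gap at row 3 is ignored
```

With min_gap = 1 the same code reports rows 3 and 6, which reproduces the false line break between stacked symbols.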
The following is the modified source code (based on Ankit Saroch's original script), which I ran in Octave, which is largely compatible with MATLAB (note that I did not need some of the functions Ankit originally used in MATLAB, which are not available in the standard Octave packages). The main program is segmentation.m, which reads in the source image and calls lines_crop and letter_crop to crop each syllable (i.e. the vertically stacked symbols). max_space is the number of consecutive empty rows (or columns) required to form a line/syllable boundary. I also included a modified function clip.m, which was useful for leaving an arbitrary margin around the cropped region. Some debug/experimentation comments are left in as well.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% lines_crop.m
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function [fl re]=lines_crop(im_texto)
% Divide text in lines
% im_texto->input image; fl->first line; re->remain line
im_texto=clip(im_texto);
max_space = 5;
num_filas=size(im_texto,1);
fl=im_texto; % Defaults, so the outputs are defined even for an empty input
re=[ ];
for s=1:num_filas
    if sum(im_texto(s,:))==0
        % Note: only rows s and s+max_space are tested, not the rows in between
        if (s+max_space < num_filas) && (sum(im_texto(s+max_space,:)) == 0)
            nm=im_texto(1:s-1+max_space, :); % First line matrix
            %pause(1);
            rm=im_texto(s+max_space:end, :); % Remain line matrix
            %pause(1);
            fl = clip(nm);
            %pause(1);
            re=clip(rm);
            %*-*-*Uncomment lines below to see the result*-*-*-*-
            %subplot(2,1,1);imshow(fl);
            %subplot(2,1,2);imshow(re);
            break
        endif
    else
        fl=im_texto; %Only one line.
        re=[ ];
    end
end

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% letter_crop.m
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function [fl re space]=letter_crop(im_texto)
% Divide letters in lines
im_texto=clip(im_texto);
num_filas=size(im_texto,2); % number of columns here, despite the name
max_space = 5;
fl=im_texto; % Defaults, so the outputs are defined even for an empty input
re=[ ];
space = 0;
for s=1:num_filas
    s;
    if sum(im_texto(:,s)) == 0
        % Note: only columns s and s+max_space are tested, not the columns in between
        if (s+max_space < num_filas) && (sum(im_texto(:,s+max_space)) == 0)
            k = 'true';
            nm=im_texto(:,1:s-1+max_space); % First letter matrix
            %figure,imshow(nm);
            %title('first letter in the function letter_in_a_line');
            %pause(1);
            rm=im_texto(:,s+max_space:end); % Remaining line matrix
            %figure,imshow(rm);
            %title('remaining letters in the function letter_in_a_line');
            %pause(1);
            fl = clip(nm);
            %pause(1);
            re=clip(rm);
            space = size(rm,2)-size(re,2);
            %*-*-*Uncomment lines below to see the result*-*-*-*-
            %subplot(2,1,1);imshow(fl);
            %subplot(2,1,2);imshow(re);
            break
        endif
    else
        fl=im_texto; %Only one line.
        re=[ ];
        space = 0;
    end
end

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% clip.m
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function img_out=clip(img_in)
[x y]=size(img_in);
[f c]=find(img_in);
max_r = max(f) + 0;
if max_r > x
    max_r = x;
endif
temp_img = img_in(min(f):max_r,min(c):max(c));
[f1 c1]=find(temp_img);
img_out=temp_img(min(f1):max(f1),min(c1):max(c1));
endfunction

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% segmentation.m
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function result=segmentation(filename)
% graythresh and im2bw come from the Octave image package;
% recent Octave versions need 'pkg load image' before calling this function.
warning off;
serial=0;
imagen=imread(strcat(filename,'.jpg'));
% Show image
imagen1 = imagen;
% Convert to gray scale
if size(imagen,3)==3 %RGB image
    imagen=rgb2gray(imagen);
end
% Convert to BW
threshold = graythresh(imagen);
imagen =~im2bw(imagen,threshold);
imagen2 = imagen;
re=imagen;
while 1
    %Fcn 'lines_crop' separate lines in text
    [fl re]=lines_crop(re); %fl= first line, re= remaining image
    imgn=fl;
    n=0;
    %Uncomment line below to see lines one by one
    if (size(fl)>0)
        %figure,imshow(fl);pause(.5)
    else
        break;
    endif
    %-----------------------------------------------------------------
    spacevector = []; % to compute the total spaces between
                      % adjacent letter
    rc = fl;
    while 1
        %Fcn 'letter_crop' separate letters in a line
        [fc rc space]=letter_crop(rc); %fc = first letter in the line
                                       %rc = remaining cropped line
                                       %space = space between the letter
                                       % cropped and the next letter
        %uncomment below line to see letters one by one
        if (size(fc)>0)
            imwrite(~fc,strcat(filename,"-",num2str(serial),".png"));
            %figure,imshow(fc);pause(0.5);
            serial=serial+1;
        else
            break;
        endif
    end
end
endfunction

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% licence.txt
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Copyright (c) 2012, Tsan-Kuang Lee
% Copyright (c) 2011, Ankit Saroch
% Copyright (c) 2006, Diego Orlando Barragan Guerrero
% All rights reserved.
%
% Redistribution and use in source and binary forms, with or without
% modification, are permitted provided that the following conditions are
% met:
%
% * Redistributions of source code must retain the above copyright
% notice, this list of conditions and the following disclaimer.
% * Redistributions in binary form must reproduce the above copyright
% notice, this list of conditions and the following disclaimer in
% the documentation and/or other materials provided with the distribution
% * Neither the name of the EQBYTE Instruments Cía. Ltda. nor the names
% of its contributors may be used to endorse or promote products derived
% from this software without specific prior written permission.
%
% THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
% AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
% IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
% ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
% LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
% CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
% SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
% INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
% CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
% ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
% POSSIBILITY OF SUCH DAMAGE.
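In the clip listing above the margin is written as max(f) + 0, i.e. zero, and the final tight crop would trim any margin again. If a margin is actually wanted, one possible approach is to pad the bounding box and clamp it to the image bounds. The following is only a sketch on a toy image; pad is a hypothetical parameter and is not part of the original script.

```octave
% Toy binary image with ink in rows 3-4, columns 2-3 (1 = ink).
img = zeros(6, 5);
img(3:4, 2:3) = 1;

pad = 1;                                % hypothetical margin in pixels (assumption)
[r, c] = find(img);                     % coordinates of all ink pixels
r1 = max(min(r) - pad, 1);              % clamp each edge to the image bounds
r2 = min(max(r) + pad, size(img, 1));
c1 = max(min(c) - pad, 1);
c2 = min(max(c) + pad, size(img, 2));
img_out = img(r1:r2, c1:c2);            % the 2x2 blob plus a 1-pixel border
disp(size(img_out))                     % shows 4 4
```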