pathak.ravi1989
BAN USERUse cache with augmented splay trees such that the element if goes beyond x height will be removed from the tree
Here x can be the number of elements the client want to access fastly
Advantage
1. it has better average run time for insertion removal and search amortized 0(log n)
2. Benefits with less memory usage
Break the 1TB file into number of cores system has, eg if system has 4 cores than 250 GB each file block size
Now make a thread which will process each file block
Within the thread
Use ArrayList
while file encounters character either '{' or '}'
if last index value is '{' and current value is '}; then remove both of them from list
else add the value to list;
return list;
Once these threads have runned on the different files they will produce Arraylists equal to number of cores say eg 4
Then check the ArrayLists
if all arraylists are empty : then print balanced
else
for i=0 to arraylist.size
check for pairs of bracket: if found remove the index from list
if arraylist is empty now then : print balanced
else
print unbalanced;
1st doubt: 1TB itself is a huge size and assuming a memory rich server is reasonable for it.
- pathak.ravi1989 January 23, 20142nd doubt: Huge servers have huge RAM space and thus big compiler memory heaps. Although taking your point on unlucky sequence: we can prefer compressing data.
an elegant solution will be
Long countx=0,county=0;filesegment=index;//filesegment defines which section of file eg 4 four cores we have 4 indexes namely {0,1,2,3}
while !EOF
if current value is '} then
{
if countx>0 // number of '{' are more then
countx--;
else if (filesegment==0 && count==0)// 1st index file cannot have '}' in absence of '{'
print unbalanced; break; System.exit();
else if filesegment!=0// else we will add in count '}'
county++;
}
if last index value is '{' and current value is '{' then countx++;
end while
return (countx+1,county);//countx+1 because when countx=0 it will have one '{' because of if statement
After thread completes tasking we join the output from all the threads
O1=sum(countx1,countx2,...)//taking sum of all countxs
O2=sum(county1,county2,....)//taking sum of all countys
for each filesegments
if i=filesegments.length-1 then
if(O1>O2)// one before last file index we have more '{' and less '}' then unbalanced
print unbalanced;BREAK;System.exit();
if((countx(i)>=county(i+1))&& i!=filesegments.length-1)
O1=O1-county(i+1); O2=O2-county(i+1);
if((countx(i)<county(i+1))&&i!=filesegments.length-1)// unbalanced paranthesis
print unbalanced;BREAK;System.exit();
end for
if O1==O2==0 print balanced
This is fast compared to the above one as if these unlikely sequences follows illegal constructs can be handled fast. No need of having bulky datastuctures