Title of Invention | "A TRANSLATION SYSTEM FROM A FIRST LANGUAGE TO A SECOND LANGUAGE" |
---|---|
Abstract | The present invention relates to a translation system from a first language to a second language comprising: a pre-translation engine for separating symbols, decimals, integers and abbreviations (hereinafter referred to as special words) and numerical and roman numbers from the input text; one output of said pre-translation engine is connected to a main translation engine for identification of sentences, identification of phrases and identification of words, thereafter checking the said sentences with linguistic properties and semantic analysis from the multiple dictionaries in at least one Data Bank means; the second output of said pre-translation engine, carrying the separated symbols, decimal integers, abbreviations, numerical and roman numbers, is connected to one input of a post-translation engine for assembly in the final output in the second language; the output of the main translation engine is connected to another input of the post-translation engine for final assembly to produce the desired output in the special second language; and, if desired, the final assembly of the output is re-passed for self-checking through the special Data Bank means for further accuracy. |
Full Text | This invention relates to a language translation system, more particularly to a language translation system which can handle languages of widely different grammatical structure, like English and Hindi, and translate one into the other with a very high level of accuracy over a variety of general and professional domains and associated language usage. Further, this invention can handle words and phrases with due regard to a comprehensive set of lexical properties, which ensures output in the target language with a very high level of semantic and grammatical accuracy. There is also an extensive 'rule rich' library that ensures a high level of translation accuracy of both meaning and intent.

BACKGROUND

There are three known methods by which one can translate from one language to another, namely the direct method, the transfer method and the interlingua method.

Direct method: The direct method translates word by word without interpreting sentence structure. The result is a translation that is in the word order of the source text, which can result in a significant loss of meaning and readability. Accordingly, the quality of translation is not acceptable.

Transfer method: The transfer method performs semantic analysis of the source language sentences and transfers that information into the target language based on a set of rules specific to that language direction. The quality of the resulting translations is significantly higher than that of the simple direct method translations. However, in this method the development of language pairs is time consuming and expensive.

Interlingua method: The interlingua method would input the source language into an intermediate repository where expressions, sentence semantics and forms of expression would be replaced, independent of any language, and then the intermediate language, or meta-language, would be translated into the target language. This theoretical method has yet to be used in any commercial application.
The British Computer Society Natural Language Translation Specialist Group has extensively documented translation systems available worldwide. It is significant that no translation system has been listed that can translate English to Hindi or vice versa. In addition, the literature refers to a variety of other systems, like semantic grammars, which have been shown to be effective in providing accurate translation for limited domains. However, they are usually hard to expand to cover new domains. Modular grammars have been used to overcome the problems associated with expanding semantic grammars to new domains. Each sub-grammar covers the dialogue acts required for one sub-domain. An additional grammar provides cross-domain dialogue acts such as common openings and closings. All grammars share one library with common concepts, such as time expressions and proper names.

At present, studies and developments of machine or computer translation systems for translating Japanese sentences into English, or conversely English into Japanese, are actively in progress, and one such system has been described in US patent number 4,980,829. However, none of these approaches has been able to produce a commercially viable machine translation system that can handle languages with disparate grammatical structure, like English and Hindi, to a level of efficiency that can be accepted in professional transactions.

Translation systems can be classified into the four categories mentioned below, depending on the object of translation:

• a so-called electronic-dictionary based translation, in which the translation is performed with reference to a dictionary on a word basis
• a translation which is performed on the basis of template sentences stored in the form of sentences; translation performed by replacing some words of the template sentence by other word(s) falls within this category
• a translation relying on the parsing of a complicated sentence of given patterns
• a translation directed not only to the sentence but also to a text as a whole, i.e. a meaningful set of sentences

The object of this invention relates particularly to the fourth category, where a text, say in English, is translated to Hindi and vice versa by conducting in-depth analysis of the text, for instance part-of-speech analysis using a word dictionary and a phrase dictionary, translation of compound and complex sentences besides simple sentences, transliteration, refined translation for multiple domains, and self-correction to produce accurate and efficient translation.

A further object of this invention is to apply multiple translation programs simultaneously and make a translation by composing the best parts from the various outputs.

Yet another object of this invention is to include knowledge-based, statistical and direct dictionary-based approaches, namely a knowledge-based system, statistical dialogue act assignment and glossary look-ups.
To achieve the said objective, this invention provides a translation system from a first language to a second language comprising: a pre-translation engine for separating symbols, decimals, integers and abbreviations (hereinafter referred to as special words) and numerical and roman numbers from the input text; one output of said pre-translation engine is connected to a main translation engine for identification of sentences, identification of phrases and identification of words, thereafter checking the said sentences with linguistic properties and semantic analysis from the multiple dictionaries in at least one Data Bank means; the second output of said pre-translation engine, carrying the separated symbols, decimal integers, abbreviations, numerical and roman numbers, is connected to one input of a post-translation engine for assembly in the final output in the second language; the output of the main translation engine is connected to another input of the post-translation engine for final assembly to produce the desired output in the special second language; and, if desired, the final assembly of the output is re-passed for self-checking through the special Data Bank means for further accuracy.

The pre-translation engine comprises means for checking symbols, the output of said means being connected to the means for checking decimals, integers and abbreviations and to the means for checking numerical numbers and roman numbers. The arrangement between the said means is such that when the sentence is passed through the pre-translation engine, the symbols, abbreviations, decimals, integers, numerical numbers and roman numbers are separated.

The main translation engine comprises means for identification of sentences, the said means being connected to the means for identification of phrases and the means for identification of words.
The means for identification of phrases is connected to multiple Data Banks of different domains for checking whether the phrase belongs to Agriculture, Science & Technology or legal Banking. The means for identification of words is connected to multiple Data Banks of multiple domains for identification of words, and up to three different domains can be handled by the said multiple Data Banks at a time.

If the meaning of the word is not found, the output of the multiple Data Banks is connected to transliteration means, and the output of the transliteration means is connected to a refined translation means for selecting the meaning of the word. If the meaning of the word is found, the output of the multiple Data Banks is connected to means for identification of the grammar property of the word, and the said means is connected to the refined translation means for selecting the appropriate word.

The outputs of the said refined translation means and of the phrase identification means are connected to identification means to analyze the type of sentence, namely imperative, interrogative and negative sentences. Means to create templates in the first language and their output in the desired second language are provided.

The output of said identification means is connected to means for checking complex and compound sentences based on the number of conjunctions. The output of said checking means for complex and compound sentences is further connected to means for checking relative, adjectival and adverbial clauses. If complex or compound sentences are found, the simple sentences are extracted from the said complex sentences and are passed on, along with the other simple sentences, to means to decide the tense of the sentences. This is followed by means for processing of the verb depending on the tense of the sentence, and the output of the means for processing of the verb is finally connected to means for final formatting of the sentences depending upon subject and object, putting them in the right places.
The post translation engine consists of means for assembling and re-arranging sentences in Hindi sentence structure for final output. The said dictionaries include a general dictionary, an agricultural dictionary, technical and scientific dictionaries and a Banking dictionary, with classification of linguistic rules and final analysis of parts of speech and their various forms. The linguistic rules number 2000-3000, as herein described. A computing system wherein the above translation system is included. An ASIC including the translation system.

This invention will now be described with reference to the accompanying drawings:

Fig 1 shows the translating system, according to this invention.
Fig 2 shows the pre-translation means, according to this invention.
Fig 3 shows the main translation means, according to this invention.
Fig 4 shows the post-translation means, according to this invention.

Referring to the drawings, item (5) shows the pre-translation engine, item (6) shows the main translation engine and item (7) shows the post-translation engine. The details and the functions of the pre-translation engine, main translation engine and the post-translation engine are given below.

a) Pre Translation Engine (5)

The Pre Translation Engine (5) consists of means (5.1) for checking symbols, which is connected to means (5.2) for checking decimal integers, abbreviations and double quotes, and to means (5.3) for checking roman and numerical numbers. In the pre translation engine (5) a routine check for the presence of decimal integers (numbers) is carried out in means (5.2), so that these numbers do not cause any confusion due to the presence of 'fullstops' (.) in the string.
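The number checks described for means (5.2) and (5.3) can be illustrated with a minimal sketch. It is not the implementation of this specification: the names `TokenKind` and `classifyToken` are hypothetical, standard `std::string` is used in place of the `CString` type seen in the function list, and the roman-numeral check is deliberately loose.

```cpp
#include <cctype>
#include <string>

// Hypothetical token classes for the pre-translation number checks.
enum class TokenKind { Decimal, Integer, Roman, Other };

// True if s is non-empty and consists only of ASCII digits.
static bool allDigits(const std::string& s) {
    if (s.empty()) return false;
    for (unsigned char c : s)
        if (!std::isdigit(c)) return false;
    return true;
}

// Classify one whitespace-delimited token of the input string.
TokenKind classifyToken(const std::string& tok) {
    if (allDigits(tok)) return TokenKind::Integer;

    // Decimal: digits '.' digits, so this '.' is not read as a full stop.
    std::size_t dot = tok.find('.');
    if (dot != std::string::npos &&
        allDigits(tok.substr(0, dot)) && allDigits(tok.substr(dot + 1)))
        return TokenKind::Decimal;

    // Roman (assumption): only upper-case roman letters. This does not
    // validate the numeral, and the pronoun "I" is also accepted.
    if (!tok.empty() && tok.find_first_not_of("IVXLCDM") == std::string::npos)
        return TokenKind::Roman;

    return TokenKind::Other;
}
```

Tokens classified as `Decimal`, `Integer` or `Roman` would be set aside for final assembly, so that only the remaining text is examined for sentence boundaries.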
After this, the string is checked for abbreviations and double quotes, like "Rs.", "Mr.", "Ms.", "Dr." etc. Changing of the $ symbol to "Dollar" is carried out in means (5.1). The pre translation engine checks these symbols, abbreviations, decimal integers, roman and numerical numbers, separates them and sends them for final assembly to the post translation means.

b) Main Translation Engine (6)

The main translation engine (6) mainly consists of identification of the sentence means (6.1) connected to identification of phrase means (6.2) and identification of word means (6.3). The identification of phrase means (6.2) is connected to a dictionary of data banks (DB1), consisting of an agricultural phrase dictionary, a scientific and technical phrase dictionary, a banking phrase dictionary, dictionaries of other domains and a general phrase dictionary. The data banks used in the instant invention are multiple data banks. The output of the data bank passes through the general phrase dictionary; if a phrase is found, it is passed through the identification of phrase means (6.2.1).

The identification of words means (6.3) is connected to a data bank (DB2), which consists of an agricultural dictionary for words, a scientific and technical dictionary for words, a banking dictionary for words, dictionaries of words of other domains, and the general dictionary of words. If the meaning of the word is found, the output of the data bank (DB2) is connected to the means for identification of the grammar property of the word (6.3.2). If the meaning of the word is not found, the data bank (DB2) is connected to transliteration means (6.3.1). The output of the transliteration means is connected to refine translation means (2) for selection of the appropriate word. The output of the means for identification of the grammar property of the word (6.3.2) is also connected to the refine translation means for selecting the meaningful word.
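The word look-up in data bank (DB2), with its fall-through to transliteration when no meaning is found, might be sketched as follows. This is only an illustration under stated assumptions: the types `Entry`, `Dictionary` and `LookupResult` and the function `lookupWord` are hypothetical, and the order in which the dictionaries are searched (domain dictionaries before the general one) is an assumption rather than something the description states.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// One dictionary entry: target-language meaning plus grammar property.
struct Entry {
    std::string meaning;   // illustrative meaning, romanised for the example
    std::string property;  // e.g. "noun", "verb"
};

// A single dictionary of the data bank, keyed by the source-language word.
using Dictionary = std::unordered_map<std::string, Entry>;

struct LookupResult {
    bool found;   // false: the word should be sent for transliteration
    Entry entry;  // valid only when found is true
};

// Search the dictionaries in the order given and return the first hit.
// Assumption: the caller lists domain dictionaries before the general one.
LookupResult lookupWord(const std::vector<const Dictionary*>& banks,
                        const std::string& word) {
    for (const Dictionary* bank : banks) {
        auto it = bank->find(word);
        if (it != bank->end()) return {true, it->second};
    }
    return {false, Entry{}};
}
```

A caller would route a result with `found == true` to the grammar-property step and a result with `found == false` to transliteration.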
Thereafter, the output of the identification of phrase means, the output of the refine translation means (2) and the output of the means for identification of the grammar property of the words (6.3.2) are connected to identification means (6.4) for analysis of the type of sentence, namely imperative, interrogative and negative sentences. The said means for identification of the type of sentence (6.4) is connected to the means (6.5) for checking complex and compound sentences based on conjunctions, and thereafter to the means for checking relative, adjectival and adverbial clauses. If complex sentences are found, the simple sentences are extracted from the complex sentences and are passed on, along with the other simple sentences, to means to decide tenses. The said means to decide tense is further connected to means for processing of the verb depending upon the tense of the sentence, and finally the said processing means is connected to final formatting means for formatting of sentences, formatting subject and object and putting them in the right place.

In the main translation engine, the translation is based on the comparative aspect of English grammar rules to Hindi grammar rules. There are almost 2000-3000 rules implemented in this application.

The string is checked for punctuation in means (6.1); if there is no punctuation, a full stop is appended to the string. Depending upon the number of punctuation marks, the number of sentences present in the string is determined. Then each sentence from the translation string is stored, one by one, in an array of strings in a stack. After separating each sentence, these sentences are checked for abbreviations. Each character in the sentence is thereafter checked. A character is either an alphabetic or a numerical character. This is determined on the basis of the semantic analysis of the language.
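The punctuation check of means (6.1) can be sketched as below. The function name `splitSentences` is hypothetical, and treating '.', '?' and '!' as the sentence-ending punctuation marks is an assumption, since the description does not list them. The sketch also assumes that decimal numbers and abbreviations have already been set aside by the pre-translation engine, so that every remaining full stop ends a sentence.

```cpp
#include <string>
#include <vector>

// Split a string into sentences on sentence-ending punctuation.
// If the string does not end with punctuation, a full stop is appended.
std::vector<std::string> splitSentences(std::string text) {
    const std::string enders = ".?!";  // assumed set of sentence enders

    // Drop trailing spaces so the last character is meaningful.
    while (!text.empty() && text.back() == ' ') text.pop_back();
    if (text.empty()) return {};

    // No closing punctuation: append a full stop.
    if (enders.find(text.back()) == std::string::npos) text.push_back('.');

    std::vector<std::string> sentences;  // one entry per sentence
    std::string current;
    for (char c : text) {
        if (current.empty() && c == ' ') continue;  // skip leading spaces
        current.push_back(c);
        if (enders.find(c) != std::string::npos) {
            sentences.push_back(current);
            current.clear();
        }
    }
    return sentences;
}
```

The number of elements returned corresponds to the number of sentence-ending punctuation marks found in the string.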
The sentence is then checked for missing articles, verbs and plural forms. In compound sentences where the subordinate clause has no auxiliary verb or main verb, the auxiliary verb and main verb are decided using the main clause, e.g. "Ram is going to Delhi and Shyam to Jaipur." Then carriage returns and new line characters in the sentences are replaced by a space. Along with this, words with apostrophes, such as 'would've' or 'O'clock', are replaced by 'would have' and 'O clock'. After this, each word is separated from the sentence according to the spaces in the sentence and stored in an array of structures in a stack.

Each sentence is checked for the presence of phrases in (6.2) and (6.2.1). If phrases are found, their properties and Hindi meanings are picked from the data bank (DB1) and a flag is set for that particular phrase in the given sentence, so that its values or meaning are not checked again in future. The property of a phrase can be a noun, a verb, an adjective or a preposition.

After checking for phrases in (6.2.1), each word of the sentence is checked for its Hindi meaning and property in the data bank (DB2). If a word is found, then the properties and related Hindi meaning are stored in the structure in the same position as the English word itself. The property of any word can be a noun, a pronoun, a verb, an adjective, an adverb, a preposition or a conjunction. If a word is a noun, then the property is further divided into sub-categories like masculine noun, feminine noun, place noun and event noun. If a word is a verb, then the property is further divided into verb transitive, verb intransitive, "ing" form, second form of the verb, third form of the verb and plural form of the verb. If the word is not found, it is sent for the Transliteration Process (6.3.1) (discussed below). If the word is found to be a verb, a verb flag is set to decide the tense of the sentence.
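The clean-up and word separation steps described above (line breaks replaced by spaces, apostrophe forms expanded, words split on spaces) might look like the following sketch. The function names are hypothetical, and only the two apostrophe forms quoted in the text are handled; a fuller table of contractions would be an extension.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Replace every occurrence of `from` in `s` with `to`.
static void replaceAll(std::string& s, const std::string& from,
                       const std::string& to) {
    if (from.empty()) return;
    std::size_t pos = 0;
    while ((pos = s.find(from, pos)) != std::string::npos) {
        s.replace(pos, from.size(), to);
        pos += to.size();  // continue after the inserted text
    }
}

// Normalise one sentence and split it into words on whitespace.
std::vector<std::string> normaliseAndSplit(std::string sentence) {
    // Carriage returns and new lines become spaces.
    for (char& c : sentence)
        if (c == '\r' || c == '\n') c = ' ';

    // Apostrophe forms from the description: would've, O'clock.
    replaceAll(sentence, "'ve", " have");
    replaceAll(sentence, "O'clock", "O clock");

    // Separate the words according to the spaces in the sentence.
    std::istringstream in(sentence);
    std::vector<std::string> words;
    std::string word;
    while (in >> word) words.push_back(word);
    return words;
}
```

Each element of the returned vector would then be looked up for its Hindi meaning and property.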
If a sentence is not a complete phrase, the helping verb and the main verb of the sentence are decided after a check for the presence of an adverb between the helping verb and the main verb in (6.4). There is a check for imperative sentences in (6.4) to see whether the sentence starts with a verb. Whether the sentence is interrogative is decided by the presence of "?"; there is then a check for a question word like "who", "which", "what" etc., and the meaning of that particular question word is decided. The interrogative sentence is then changed to an assertive sentence. There is a check for the gerund form in the sentence in (6.5) and (6.7).

After this there is a check to see whether the sentence is a complex, compound or simple sentence in (6.5). If the sentence is complex or compound, the simple sentences are extracted in (6.6) by breaking it into simple sentences; this is decided by the presence of conjunctions. Infinitive verbs are checked in the sentence and converted into nouns in (6.7). Then negative sentences are checked, based on the presence of the word "not", and converted to assertive sentences.

The position of the main verb is found, based on the position of the verb flag; along with this, the positions of the subject and objects are decided upon. After the position of the verb is found, there is a decision on the tense of the verb. Tense is decided by the form of the verb present. If the verb is in the first form, then its tense may be present or future. The tense of the sentence is finally decided by the combination of auxiliary verb and main verb, e.g. for the main form "going": if the auxiliary verb is "is", the tense will be "present continuous"; if the auxiliary verb is "was", the tense will be "past continuous".

Based on the tense, the verb is processed in (6.8). According to the position of the verb, the subject and object are processed in (6.9) and then formatted and assembled together to make a complete sentence in (6.10).
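The way the auxiliary verb and the form of the main verb combine to fix the tense can be shown with a small sketch. The function name `decideTense` is hypothetical. The description gives only the cases "is" and "was" with the main form "going"; handling "am", "are" and "were" in the same way is an assumption, and every other combination is reported as undetermined rather than guessed.

```cpp
#include <string>

// True if s ends with the given suffix.
static bool endsWith(const std::string& s, const std::string& suffix) {
    return s.size() >= suffix.size() &&
           s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Decide the tense from the auxiliary verb and the form of the main verb.
// Only the "ing" form is covered, as in the example of the description.
std::string decideTense(const std::string& auxiliary,
                        const std::string& mainVerb) {
    if (endsWith(mainVerb, "ing")) {
        // "is" is from the description; "am"/"are" are assumed analogues.
        if (auxiliary == "is" || auxiliary == "am" || auxiliary == "are")
            return "present continuous";
        // "was" is from the description; "were" is an assumed analogue.
        if (auxiliary == "was" || auxiliary == "were")
            return "past continuous";
    }
    return "undetermined";  // other verb forms are outside this sketch
}
```

For instance, `decideTense("is", "going")` yields "present continuous" and `decideTense("was", "going")` yields "past continuous", matching the example in the text.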
c) Post Translation Engine (7)

After formatting in (6.10), the sentence is sent for post translation in (7.1). In post translation, the formatted Hindi sentence is checked for wrong Hindi words. If found correct, the user gets the correct Hindi sentence in a separate new window.

2. Refined Translation: (2)

The main procedures remain the same. The only special quality of this process is that when the sentence is passed through the Refined Translation Engine, it provides the user with a range of multiple Hindi meanings of the English word(s), so that the user can select the Hindi meanings appropriate to his purpose.

Other Features:

Selected Text(s) / Batch Translation Processing: (3) If one does not want the complete typed text translated into Hindi but only a small portion of it, this is done through the selected text, which is copied on to the clipboard, from where it is extracted into a global function in response to the user selecting 'selection translation'. This is then sent to the General Translation Engine of the application, and the procedure followed from here on is the same as that of the General Translation.

Letter Translation: This feature allows the user to design customised letters, which can also be saved for later use. It also has a mail merge facility.

Annexure Translation: Annexures can be translated by the use of this feature. Annexures can be scanned and then translated.

File Import/Export: Letters in formats like .rtf, .doc, .txt and ASCII files can easily be imported to the English to Hindi Translator. Clip art and pictures can also be imported through the 'Insert Clip Art' facility of the Translator.

Extra Hindi Fonts: More than 100 Hindi fonts with international keyboard standard have been developed for the Translation engine, so that the user can select the appropriate one according to his/her choice.
Scanner/OCR Support: Any text or document can be scanned, and the converted text can then easily be directed to the Translation Engine for translation.

Character Map: A character map is attached to the Translator to ease Hindi typing for the user.

English & Hindi Spell Check: The user can check the English spelling before the translation. If the user adds any words/sentences in Hindi, they can be checked through 'Hindi spellcheck'.

Transliteration Option / Process (4)

If the user selects the transliteration option, then the function call searches for the words which are not found in the dictionary of general words. After checking this condition it goes into the "Not Found" section of the program, where it searches for the word in the table meant for transliteration. By 'Table of Transliteration' it is meant that there is a table at the backend containing some common names of people and companies/organizations. If a word is not found in the 'Table of Transliteration' either, then it goes in for actual transliteration. Actual transliteration is the process of creating a Hindi word after reading the letters of a particular word, going by certain defined rules, e.g. if the document typed contains a list of names of a few persons and some company names like:

Mrityunjay Jha
Ram Naresh Yadav
Laxman Pillai
Vimal Kumar
Harjeet Singh Ahluwalia
State Bank of India
ANZ Grindlays Bank
and so on,

the output (shown here in the Hindi font encoding of the Translator) would be as follows:

Ek R;qat; >k
jke ujs'k ;kno
y{e.k fiYyS
foey dqekj
gjthr flag vkgywokfy;k
LVsV cSad vkWQ bafM;k
,n,untSM fxzUMyst+ cSad

This output will come only after it finds all the words in the 'Table of Transliteration'. Let us say the name 'Vimal Kumar' is not found in the transliteration table; then, according to the rules defined in the Actual Transliteration, the output would be foey dqekj. This is because the term 'Vi' anywhere in the word would be written as 'fo' in Hindi, and likewise 'ma' would be written as 'e' in Hindi.
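The two-stage process above (table look-up first, rule-based transliteration as the fall-back) can be sketched as follows. This is not the rule set of the Translator: the function `transliterate` and the `Table` type are hypothetical, the longest-match strategy over letter groups of up to three characters is an assumption, and the output strings are in the same Hindi font encoding as the examples in the text. Real rules are evidently context dependent (compare 'ma' in 'Vimal' and in 'Kumar'), which this sketch does not attempt to model.

```cpp
#include <cctype>
#include <map>
#include <string>

// Maps a source-language string to its font-encoded Hindi form.
using Table = std::map<std::string, std::string>;

// Look the word up in the table of known names first; if it is absent,
// build the output from letter-group rules (longest group tried first).
std::string transliterate(const std::string& word, const Table& names,
                          const Table& rules) {
    auto hit = names.find(word);
    if (hit != names.end()) return hit->second;

    // Rules are keyed in lower case, so fold the word before matching.
    std::string lower;
    for (unsigned char c : word)
        lower.push_back(static_cast<char>(std::tolower(c)));

    std::string out;
    std::size_t i = 0;
    while (i < lower.size()) {
        bool matched = false;
        for (std::size_t len = 3; len >= 1; --len) {
            if (i + len > lower.size()) continue;
            auto rule = rules.find(lower.substr(i, len));
            if (rule != rules.end()) {
                out += rule->second;
                i += len;
                matched = true;
                break;
            }
        }
        // No rule for this letter: copy it unchanged (assumption).
        if (!matched) out.push_back(lower[i++]);
    }
    return out;
}
```

With the rules 'vi' to 'fo' and 'ma' to 'e' from the text, plus 'l' to 'y' as inferred from the example output, the word 'Vimal' is rendered as 'foey'.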
Dictionary

The translator (Anuvaadaak) at present consists of four dictionaries, namely:

General Dictionary
Agricultural Dictionary
Technical & Scientific Dictionary
Banking Dictionary

General Dictionary: This allows the user to search for the meanings of different words. It has the further sub-options:

Look for a Single Word
Look for a Single Phrase
Look for all the words of the dictionary
Look for all the phrases of the dictionary

Besides these options, the user can dynamically add new words and phrases, in English as well as Hindi, to the dictionary. Similarly, the user can modify the pre-existing words and can add more meanings to a single English word. The same applies to phrases, i.e. the user can add and modify phrases as well. There are 80,000 single and double words (Phrases) present in the Dictionary.

Look for a Single Word: This section allows the user to look for the meaning of a single word. The screen interface shows a small edit box where the user can type the English word; pressing the display button or the Enter key displays the meaning of the specific word. If the spelling of the word that the user has typed is correct, then its meanings are displayed at the right side; otherwise an error message is generated saying "Please check the word again, you entered. Either the spellings are wrong or the word you typed is not present in the dictionary".

Look for a Single Phrase: This section allows the user to look for the meaning of a single phrase. The screen interface shows an edit box where the user can type the English phrase; pressing the display button or the Enter key displays the meaning of the typed phrase. If the spelling and the phrase typed are correct, then its meaning is displayed at the right hand side; otherwise an error message is displayed saying "Please check the phrase you entered again. Either the spellings are wrong or the phrase you typed is not present in the dictionary".
Look for all words of the dictionary: This section allows the user to see all the words present in the dictionary. The screen interface shows a list box, an edit box and a static box for displaying the meaning of the selected word. The user can select a word from the list box; clicking the display button or pressing the Enter key displays the meanings at the right hand side.

Look for all the phrases of dictionary: This section allows the user to see all the phrases present in the dictionary. The screen interface is the same as that of 'Look for all words of dictionary'.

Add New Word: This option allows the user to add new words dynamically to the dictionary with their properties. The screen interface displays a number of edit boxes and check boxes with three buttons, namely 'Add', 'Charmap' and 'Cancel'. The user can type the new word in the specified edit box and can select its corresponding property, i.e. whether the word entered is a noun, verb, adverb, adjective and so on. Similarly, the user can type the Hindi meaning for the word. Finally, pressing the 'Add' button adds the word along with its meaning and properties to the dictionary. Before adding a new word, the program checks for a pre-existing word. If the word entered by the user already exists in the dictionary, an error message is flashed saying 'Word entered already exists. Please enter a different word'.

Add New Phrase: This option allows the user to add new phrases dynamically to the dictionary with their properties. The screen interface shows two edit boxes, two check boxes and the buttons 'Add', 'Charmap' and 'Cancel'. The user can type in the phrase and its corresponding Hindi meaning with its property; clicking on the 'Add' button adds the new phrase to the dictionary. Before adding a new phrase, the program first checks the dictionary for the phrase entered by the user, to make sure that the phrase being added is not already present.
If the phrase is already present, then an error message is flashed saying 'Phrase already exists, Please enter a different phrase'.

Agricultural Dictionary: The screen interface and functionality are the same as those of the General Dictionary, except that it displays the words related to Agriculture, Botanical names and plant diseases. There are 9000 single and double words (Phrases) present in the Dictionary.

Technical & Scientific Dictionary: The screen interface and functionality are the same as those of the General Dictionary, except that it displays the words related to Science and Technology (inclusive of Physics, Chemistry & Atomic Science). There are around 50,000 single and double words (Phrases) present in the Dictionary.

Banking Dictionary: It displays the Banking words related to Accounts, Finance and the Reserve Bank of India Glossary. There are around 10,000 words and double words (Phrases) related to banking in the dictionary.

Linguistics Rules are given below:

* EngSentBreak(char ch1,char ch2): Function checks for hyphens and breaks the sentences containing hyphens with spaces at the end of a word or a line.
* RomanHandling(CString romansent): Function handles the occurrences of Roman numbers in the sentences. Roman numbers can be in the middle of the sentences or at the start of the sentences.
* EngABC1(CString strwe,CString strwh): Function checks for occurrences of alphabetical numbering like (a), (b), (c) and so on at the start of the sentences.
* EngABC(EngSent Sentline[],int wordcount): Does the same job, but also checks whether the alphabet is in a single bracket or in double brackets.
* ProcessingofSub(int TenseFlag,int VerbPos,int SubjectPos,EngSent Sentline[],CString Sub,int wordcount): Does the processing for the word depending on its properties, i.e. depending on whether the property of the given word is Verb-transitive or Verb-intransitive, it accordingly adds "Ne" in Hindi before the Subject.
* ProcessingofVerbtran(CString wordh,int VerbPos,EngSent Sentline[],int wordcount,int TenseFlag): Function is used for the processing of the Verb-transitive.
* GetRightHofOF(CString str,CString str1,CString str3,CString str4,CString str5); GetHindiofThis(CString str,CString str1): Function adds the right Hindi meaning of 'OF', which is decided by the word placed before 'OF'.
* GetRightHindiofPossPro(CString ssube,CString sworde,CString swordh): Function fetches the correct Hindi for possessive pronouns.
* GetRightHindi(EngSent &Sentline,EngSent &Sentline1,EngSent &Sentline2,int num,int num1,int VerbFlag[]): Function fetches the correct Hindi meaning depending on the property of the given word, as one word in English can be used in different contexts, e.g.
  > India is a land of villages.
  > The airplane has landed at the airport.
  In the first sentence the word 'land' is treated as a noun, while in the second sentence the same word is treated as a verb.
* GetRightHindiPlural(EngSent Sentline[],int wordcount): Used to get the correct Hindi of plural words.
* GetRightProp(EngSent &Sentline1,EngSent &Sentline): Used to check for the correct property of the given word.
* EngCheckNumberType(CString str): Function checks for numbers present in the sentences.
* HindiToEnglish(CString Input): Function internally calls many functions such as EngGetWordProp(), EngPhrase(), EngFormatOutput(), EngTranslate() and so on. This is one of the main functions used in the process of translation.
* DisplayHindiSpellCheck(): Displays the Hindi Spell Check Window to the user.
* DisplayEngSpellCheck(): Displays the English Spell Check Window to the user.
* EngIsVowel(CString hindinoun): Checking for vowels in Hindi.
* EngConsonent(CString Verb): Checks for the ending of the word, i.e. whether the word ends in a consonant or a vowel.
* EngTransVerb(CString Verb, int TensFlag): Processing of the verb according to the tense.
* GetRightHindiofPro(CString sworde,CString swordh): Used to get the correct Hindi meaning for a pronoun word depending on the structure of the sentence.
* GetHindiWords(CString &strHindi,CString strEng): Extracts a single meaning from the list of multiple meanings given in the dictionary and stores it in a variable.
* MakeSentProp(EngSent Sentline[],int VerbFlag[],int wordcount): Used mainly for the second form of the verb; decides whether the word should be treated as an adjective or as a verb.
* EngVerb_II_III(EngSent Sentline[], int i, int VerbFlag[],int SubjectPos): Decides the tense used in the sentence structure and also takes care of the second and third forms of the verb.
* EngInIng(EngSent Sentline[], int i, int VerbFlag[],int SubjectPos): Used for handling the 'ing' form of the verb.
* EngVerb_ING(EngSent Sentline[], int i, int VerbFlag[],int SubjectPos): Used for deciding the tense of the sentence, i.e. Present Continuous, Past Continuous, Future Continuous or others.
* EngVerb_III(EngSent Sentline[], int i, int VerbFlag[],int SubjectPos): Used for handling the third form of the verb.
* EngSubjectPalural(int nounplural,EngSent SentWord): Function checks for the plural of the subject in the sentence and accordingly decides the Gender and Number.
* EngVerb_I_II_III(EngSent Sentline[], int i, int VerbFlag[],int SubjectPos): Used for Tense Handling and the three forms of the verb.
* EngDecideMainVerb(EngSent Sentline[], int VerbFlag[], int wordcount): Function decides the main verb of the sentence, i.e. if more than one verb is being used in the sentence.
* EngCheckRelClause(CString word,int VerbFlag): Checks for the Relative Clauses present in the sentences.
* EngAuxilaryVerb(CString word): Checks for an Auxiliary Verb in the sentence.
* EngIntroToNeg(EngSent Sentline[], int index,int VerbFlag[]): Used for converting interrogative sentences to negative sentences, e.g. Do I not go -> I do not go.
* EnglishHelpingVerb(CString Hverb): Handles the helping verbs present in the sentences.
* engPostTranslate(CString HindiOutSent); This function is the third phase of the translation: PreTranslation runs first, then MainTranslation, and finally PostTranslation. It checks the Hindi output internally and corrects any spelling mistakes or performs any conversion required before giving the final output.
* EngObjectPronoun(CString word); Checks for the presence of a pronoun in the object of the sentence.
* EngDeterminers(CString word); Checks for determiners.
* EngPossessivePron(CString Pronoun); Checks the possessive form of the pronoun.
* EngArticle(CString Article); Handles articles in the sentences.
* EngProcessSub(CString Subject); Processes the subject.
* EngProcessVerb(CString Verb, CString &Hverb, int TenseFlag, EngSent Sentline[], int VerbPos, int wordcount); Processes the verb.
* EngProcessHelpVerb(CString Subject, CString Hverb); Processes the helping verb.
* EngPassive(CString Verb, CString &Hverb, int TenseFlag, EngSent Sentline[], int VerbPos, int wordcount); Checks for passive sentences.
* EngChangePronoun(CString Subject); Changes the meaning of a pronoun in different contexts.
* EnglishTenseHandling(EngSent Sentline[], int wordcount, int VerbFlag[], int SubjectPos); Handles the tense of sentences.
* EngFindPosition_of_Verb(EngSent Sentline[], int VerbFlag[], int wordcount, int &SubjectPos, int &VerbPos, int &ObjectPos, int &HverbPos); Finds the position of the main verb in the sentence.
* EngMakeAssertive(EngSent Sentline[], int &wordcount, int VerbFlag[]); Changes negative sentences to assertive sentences.
* EngVerbInfinitive(EngSent SentLine[], int VerbFlag[], int &wordcount); Checks for the infinitive verb and changes it into a noun.
* EngCheckComplex(int VerbFlag[], int &wordcount, int Conj[], int index, EngSent Sentline[]); Checks for complex sentences.
* EngRelativePronoun(CString Word); Checks for relative pronouns.
* EngRemoveNA_verb(CString &Verb); Removes the Hindi "Na" from the meaning of words having the verb property.
* EngIntroToAss(EngSent Sentline[], int VerbFlag[], int &wordcount); Changes interrogative sentences to assertive sentences.
* EngHelpAndMain(CString word); Checks for words which are used as a helping verb as well as the main verb.
* EngWhWord(CString word); Processes words of the "wh" family, i.e. What, Where, Why etc.
* EngGetHindiMeaning(EngSent &Sentline, EngSent &Sentline1, EngSent &Sentline2, int count, int VerbFlag[], CString Input); Retrieves the Hindi meaning of a particular word.
* EngMatchWord(CString worde, CString Verb, CString VerbII, CString VerbIII, CString special, CString plural, CString noun); Matches the exact form of the word, i.e. whether the word is being used as the first, second or third form of the verb.
* EngPhraseMatch(CString englishphrase, int &wordcount, EngSent Sentline[], int index); Matches phrases against the phrase table in the dictionary.
* EngRemoveSpaces(CString Output); Removes unwanted spaces from the sentences.
* EngFormatOutput(EngSent Sentline[], int VerbPos, int SubjectPos, int ObjectPos, int wordcount, int VerbFlag[], int TenseFlag, char Punc); Formats the final output. Formatting is done on the basis of subject and object.
* EngTranslateFunction(EngSent Sentline[], int VerbFlag[], int &wordcount, char Punc, CString Input, int &VerbPos, int &SubjectPos, int &ObjectPos, int &HverbPos, int phrase, CString &conjword, int index); Calls many other functions internally, e.g. EngProcessingVerb, EngTenseHandling etc.
* EngGetSimp_From_Complex(EngSent Sentline[], EngSent simpline[], int VerbFlag[], int &start, int conj[], int wcount, int &wordcount, int &index); Extracts the simple sentences from the complex ones.
* EngGerund(EngSent Sentline[], int VerbFlag[], int &wordcount); Checks for gerund forms of verbs and converts them to nouns.
* EngInterrogative(EngSent Sentline[], int VerbFlag[], int &wordcount); Checks for interrogative sentences.
* EngImperative(EngSent Sentline[], int VerbFlag[], int &wordcount, CString Punc); Checks for imperative sentences.
* EngAdverb(EngSent Sentline[], int VerbFlag[], int wordcount); Handles adverbs between the main verb and the helping verbs.
* EngGetWordProp(EngSent Sentline[], int VerbFlag[], int wordcount, int phraseflag[], CString Input); Checks the property of the words.
* EngPhrases(EngSent Sentline[], int VerbFlag[], int &wordcount, CString &engph, int phraseflag[]); Checks for any kind of phrase used in the sentences and gives the Hindi meaning.
* EngApostrophe(EngSent Sentline[], int &wordcount); Checks for words containing apostrophes in the sentences.
* EngGetWord(CString Input, int length, int &wordcount, EngSent Sentline[]); Stores the words of the sentence in a variable.
* EngNormalize(CString Input, char &NL); Puts the sentence into the proper format for translation.
* engPreTranslate(CString EngSentence); Checks for special symbols used in the sentences, e.g. the @ sign, $ sign etc., and converts them to the appropriate Hindi symbols.
* EngCheckSortName(CString sentence); Checks for abbreviations used in the sentences.
* EnglishToHindi(CString Input); Internally calls other functions such as EngPretranslate(), EngPhrase(), EngGetWordProp() etc.
* EngNameWithDot(CString sentence[], int sentcount); Checks for names with dots, e.g. Mr. P.L.Singh.
* IsPunctuation(CString Punct); Checks for punctuation.
* ChangeCursor(BOOL flag); Changes the cursor shape according to the action taken by the user.
* MakeRightSent(CString sent); Corrects the structure of the sentence.

I claim:

1.
A translation system from a first language to a second language comprising: the pre-translation engine for separating symbols, decimal, integers, abbreviations (hereinafter referred to as special words) and numerical and roman numbers from the input text; the one output of said pre-translation engine is connected to a main translation engine for identification of sentences, identification of phrases and identification of words, thereafter checking the said sentences with linguistic properties and semantic analysis from the multiple dictionaries in at least one of Data Bank means; the second output of said pre-translation engine for separating symbols, decimal integers, abbreviations, numerical and roman numbers is connected to one input of the post translation engine for assembly in the final output in the second language; the output of the main translation engine is connected to another input of the post translation engine for final assembly to produce the desired output in the special second language; and, if desired, the final assembly of the output is re-passed for self-checking through the special Data Bank means for further accuracy.

2. The translation system as claimed in claim 1 wherein the pre-translation engine comprises means for checking symbols, the output of said means is connected to the means for checking decimal, integers, abbreviations and means for checking numerical numbers and roman numbers, the arrangement between the said means is such that when the sentence is passed through the pre-translation engine, the symbol, abbreviation, decimal, integer, numerical numbers and roman number are separated.

3. The translation system as claimed in claim 1, wherein the main translation engine comprises means for identification of sentences, the said means is connected to the means for identification of phrases and means for identification of words.
4. The translation system as claimed in claim 1 wherein the means for identification of phrases is connected to multiple Data Bank of different domains for checking whether the phrase belongs to Agriculture, Science & Technology and legal Banking.

5. The translation system as claimed in claim 1 wherein the means for identification of words is connected to multiple Data Bank of multiple domains for identification of words, and up to three different domains can be handled by the said multiple data bank at a time.

6. The translation system as claimed in claim 1 wherein the output of the multiple Data Bank is connected to transliteration means if the meaning of the word is not found, and the output of the transliteration means is connected to a refined translation means for selecting the meaning of the words.

7. The translation system as claimed in claim 1 wherein the output of the multiple Data Bank is connected to means for identification of the grammar property of the word if the word meaning is found, and the said means for identification of the grammar property of the word is connected to refined translation means for selecting the appropriate word.

8. The translation system as claimed in claim 1 wherein the output of said refined translation means and the phrases means is connected to identification means to analyze the type of sentences, namely imperative, interrogative and negative sentences.

9. The translation system as claimed in claim 1 wherein means to create templates in the first language and their output in the desired second language is provided.

10.
The translation system as claimed in claim 1 wherein the output of said identification means is connected to means for checking complex and compound sentences based on the number of conjunctions, the output of said checking means for complex and compound sentences is further connected to means for checking relative, adjectival and adverbial clauses, if simple or complex compound sentences are found, the simple sentences are extracted from the said complex sentences and are passed on along with other simple sentences to means to decide the tense of the sentences using means for processing of verb depending on the tense of the sentences, the output of the means for processing of verb is finally connected to means for final formatting of the sentences depending upon subject, object and putting them in right places.

11. The translation system as claimed in claim 1 wherein the post translation engine consists of means for assembling and re-arranging sentences in Hindi sentence structure for final output.

12. The translation system as claimed in claim 1 wherein the said dictionaries include general dictionary, agricultural dictionary, technical and scientific dictionaries and Banking dictionary with classification of linguistic rules and final analysis of parts of speech and their various forms.

13. The translation system as claimed in claim 1 wherein the linguistic rules are 2000-2003, as herein described are provided.

14. A computing system wherein the translation system as claimed in preceding claims is included.

15. An ASIC including a translation system as claimed in preceding claims.

16. A translation system from a first language to a second language substantially as herein described with reference to and as illustrated in the accompanying drawings. |
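The three-phase flow described above (pre-translation separates special words, main translation handles the remaining text, post-translation reassembles the final output) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the specification: the names `PreTranslated`, `isNumber`, `preTranslate`, `mainTranslate`, `postTranslate` and `translate`, the `{k}` placeholder format, and the use of `std::string` in place of `CString`. Only integers and decimals are treated as special words here; symbols, abbreviations and roman numbers are left out, and the main phase is reduced to a word-by-word dictionary lookup.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical container for the two outputs of the pre-translation phase.
struct PreTranslated {
    std::string text;                   // input with special words replaced by {k}
    std::vector<std::string> specials;  // separated special words, in input order
};

// True for integers and decimals such as "20" or "3.5" (at most one dot).
bool isNumber(const std::string& token) {
    bool seenDigit = false;
    bool seenDot = false;
    for (unsigned char c : token) {
        if (std::isdigit(c)) {
            seenDigit = true;
        } else if (c == '.' && !seenDot) {
            seenDot = true;
        } else {
            return false;
        }
    }
    return seenDigit;
}

// Phase 1: separate special words and leave a numbered placeholder behind.
PreTranslated preTranslate(const std::string& input) {
    PreTranslated out;
    std::istringstream words(input);
    std::string token;
    while (words >> token) {
        if (!out.text.empty()) out.text += ' ';
        if (isNumber(token)) {
            out.text += "{" + std::to_string(out.specials.size()) + "}";
            out.specials.push_back(token);
        } else {
            out.text += token;
        }
    }
    return out;
}

// Phase 2 (stand-in): word-by-word lookup; unknown words and placeholders
// pass through unchanged. No grammar, tense or reordering is attempted.
std::string mainTranslate(const std::string& text,
                          const std::map<std::string, std::string>& dictionary) {
    std::istringstream words(text);
    std::string token;
    std::string result;
    while (words >> token) {
        if (!result.empty()) result += ' ';
        auto hit = dictionary.find(token);
        result += (hit != dictionary.end()) ? hit->second : token;
    }
    return result;
}

// Phase 3: put the separated special words back in place of their placeholders.
std::string postTranslate(const std::string& translated,
                          const std::vector<std::string>& specials) {
    std::string result = translated;
    for (std::size_t k = 0; k < specials.size(); ++k) {
        const std::string placeholder = "{" + std::to_string(k) + "}";
        const std::size_t pos = result.find(placeholder);
        assert(pos != std::string::npos && "placeholder lost in main translation");
        if (pos == std::string::npos) continue;  // release builds: skip, do not throw
        result.replace(pos, placeholder.size(), specials[k]);
    }
    return result;
}

// Runs the three phases in the order pre, main, post.
std::string translate(const std::string& input,
                      const std::map<std::string, std::string>& dictionary) {
    const PreTranslated pre = preTranslate(input);
    return postTranslate(mainTranslate(pre.text, dictionary), pre.specials);
}
```

With a one-entry dictionary supplied by the caller, `translate("book 3.5 costs 20", {{"book", "kitab"}})` returns `"kitab 3.5 costs 20"`: the two numbers bypass the dictionary lookup and are restored unchanged during reassembly. Keeping the special words out of the main phase is the point of the sketch; sentence analysis and Hindi word order are outside its scope.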
---|
647-del-2000-complete specification (granted).pdf
647-del-2000-correspondence-other.pdf
647-del-2000-correspondence-po.pdf
647-del-2000-description (complete).pdf
Patent Number | 217187 |
---|---|
Indian Patent Application Number | 647/DEL/2000 |
PG Journal Number | 37/2008 |
Publication Date | 12-Sep-2008 |
Grant Date | 26-Mar-2008 |
Date of Filing | 14-Jul-2000 |
Name of Patentee | CHOWDHURY, ANJALI ROY |
Applicant Address | C/O SUPERTECH SOFTWARE & HARDWARE PVT. LTD., G-1305, CHITTARANJAN PARK, NEW DELHI-110 019, INDIA. |
Inventors | |
PCT International Classification Number | G06F 17/28 |
PCT International Application Number | N/A |
PCT International Filing date | |
PCT Conventions | |