TAUS spoke with Eric Yu, CEO of GTCOM, Global Tone Communication Technology Co., Ltd, the world’s largest services provider in the combined field of translation, big data and artificial intelligence.
spoke with: 對話
TAUS: 翻譯自動化用戶協會 (Translation Automation User Association)
Technology Co., Ltd: 科技股份有限公司
TAUS(翻譯自動化用戶協會)對話中譯語通科技股份有限公司(“中譯語通”)CEO于洋。中譯語通是行業領先的語言、大數據和人工智能服務供應商。
At the core of the conversation lies this question: “What is the single biggest lesson that you have learnt about the translation industry?”
對話首先圍繞一個核心問題展開。TAUS:“您對翻譯行業最重要的心得體會是什么?”
What I have learnt most about the translation industry as such is that it can be considered as just one part of a much more all-encompassing “industry” that we call cross-language big data.
我認為最主要的是:翻譯行業是一個綜合性“行業”的一部分,這個“行業”就是我們所說的“跨語言大數據”。
I come from a background deeply rooted in the translation industry. Having majored in conference interpreting, I worked for China Translation Corporation, served as the Head of Conference Interpreting, CMO, and Assistant President and then Vice President of CTC, which is the largest LSP in China (translator's note: the sentence can be split before “which” when translating). I was appointed as the CEO of GTCOM when it was incorporated as a subsidiary of CTC in 2013. What I have realized in this career is that today many related language phenomena can converge in this concept of language as data in a world with artificial intelligence.
我本身跟翻譯行業淵源頗深(come from a background deeply rooted in the translation industry)。我學的是會議口譯專業,畢業后加入中國對外翻譯有限公司(“中譯”),先后擔任(當過很多職位,可以用先后)首席同傳、會議口譯部主任、市場總監、總經理助理和副總經理等職務。中譯是中國最大的語言服務供應商(LSP, language service provider),2013年,中譯成立子公司(subsidiary)中譯語通,我被任命為CEO。這些年中(in this career),我認識到許多相關的語言現象都可以作為語言數據歸結(converge in this concept of language)到人工智能這個行業中。(補范疇詞)
As we all know, machine translation is one of the most complicated parts of NLP and artificial intelligence. We call it “the jewel in the crown” of NLP. We started to invest deeply in R&D for machine translation in 2014, giving us a chance to better understand NLP, speech recognition, big data and artificial intelligence.
眾所周知,機器翻譯是自然語言處理(natural language processing)(NLP)和人工智能(AI)最復雜的部分,我們稱之為“皇冠上的寶石”。2014年(時間在前),我們開始投入大量資金(invest deeply)進行機器翻譯研發(research and development),因此對NLP、語音識別(speech recognition)、大數據和人工智能有了(giving us a chance)更深刻的認識(better understand)。
Then in October 2015, we put forward the “cross-language big data” concept, which basically involves managing data on an Internet scale. For example, instead of searching in English or Russian or Chinese using input terms and finding content only in those languages, what if we simply let people input a word in their language but eliminated all those language labels? (Translator's note: the clause order changes in the Chinese translation.) That way we would be able to have access to all relevant results in any language - Chinese, English, Russian, German and so on. And what if we were then able to analyze all those data quantitatively and qualitatively? That is the essence of our cross-language big data concept.
2015年10月,我們提出了“跨語言大數據”的概念,這一概念基本上包含了網絡規模上的數據管理。如果(instead of……only find? 并沒有)我們用英語、俄語或中文進行搜索(using input terms),得到的只能是相應語種的內容,那么如果消除所有語言標簽,只是簡單地讓大家輸入(a word in their language)他們自己的語言,結果會怎么樣呢?這樣我們可以獲取(have access to)中文、英語、俄語、德語(中文先分后總)及其他任意一種語言的所有相關結果。如果我們能夠定量、定性(quantitatively and qualitatively)地分析所有這些數據,又會怎么樣呢?這就是我們跨語言大數據概念的實質所在(essence)。
Naturally this involves using real-time machine translation. But that is only a kind of “language switch.” I began to wonder what we could achieve if we could analyze all the texts discovered from one search term, or from one piece of news returned to the user: analyze the persons involved, the time, location, entities, and all the other knowledge and information contained in that news item or document. What if we then extended this search for data by being able to grasp all the data from the past ten or twenty years and analyze them in the same way - qualitatively and quantitatively? That capability is lacking, so we are trying to offer it through this concept.
從本質上來說(Naturally),這涉及到實時機器翻譯的應用(using 將動詞翻譯為名詞,靈活使用),但這僅僅是一種“語言轉換”。我們開始思考(wonder),假如能夠從一個術語搜索到所有相關文本(analyze all the texts discovered from one search term(名詞做動詞,因此可以不譯discovered from,抓強勢動詞)),或者能夠根據用戶收到(returned to)的一條新聞來分析相關人員、時間、位置、實體以及包含的其他知識或信息(contained in that news item or document?(這個重復了上述的新聞,英文重復性強。因此要適當刪減)),會有怎樣的結果?如果我們能夠擴大數據搜索范圍(擴大后面搭配為范圍,可以說是范疇詞,而不能說擴大搜索,注意動賓搭配)(extended this search for data),獲取近10年或20年的所有數據,并以同樣的方式對其進行定量定性分析,又會有怎樣的結果?目前這個領域是空白(capability is lacking(能力是欠缺的,翻譯出來不順。可以譯為領域空白,上文就是圍繞這個領域開展)),所以我們試圖來實現(offer it(賦予這個能力就是實現,注意靈活性))這一概念。
How do your background and interest in interpretation fit into this vision?
您是如何將口譯的(狀語可以當成修飾詞提前)專業背景和興趣與這一理念聯系到一起(fit into(fit into原來意思是融入一體))的呢?
In early 2013, we launched our Global Multilingual Call Center. Previously in 2008 when I was a member of the Global Advisory Committee on language line services, I proposed that we try to see whether simultaneous interpreting could be provided over the phone. At that time it seemed like a new idea, and it still is.
在2013年初,我們推出了國際多語言呼叫中心。早在2008年(Previously in 2008),我在奧運會全球顧問委員會(Global Advisory Committee on language line services)擔任委員(a member of 中文喜歡用動詞,因此不說我是……的成員,而是擔任……職務)的時候,就提出我們應該試試通過電話進行同聲傳譯(simultaneous interpreting could be provided over the phone. 被動改主動)。那時候這個想法很超前(a new idea),事實上現在也是。
So we started to build our multilingual call center in 2012, and when we first started GTCOM as an independent company, we launched our Global Multilingual Call Center. Following recent developments, I think this is now the largest call center in China. It provides services in 12 languages and receives an average of 500,000 minutes of calls per month from all over the world. For example, we have more than 400 interpreters working exclusively for China UnionPay. Moreover, we also provide big data analysis. Every month, we analyze all the call data and generate reports, so this Center is no longer just a call center in the traditional sense, but increasingly a big-data center.
因此,2012年,我們開始籌備(過渡的詞語)成立多語言呼叫中心(成立中心而不是建立中心)。2013年初,中譯語通作為獨立公司成立(when we first started GTCOM as an independent company,此處省略了we,中心作為主語,也是中文的常見翻譯法),我們設立了全球多語言呼叫中心。近幾年,它已經發展為中國最大的呼叫中心(Following recent developments 時刻記住詞性改變),提供12種語言服務,平均每月收到全球各地客戶50萬分鐘的呼叫。比如,我們有400余位口譯員專門(exclusively)為中國銀聯(China UnionPay)提供服務。此外,我們還為客戶提供大數據分析。每個月我們都對所有呼叫大數據進行分析(analyze)并生成報告(generate reports 不是產生報告而是生成報告),從這點上講(這是中文另外加入的一點,讓表達更加通暢),呼叫中心已經不是傳統意義上的呼叫中心,而是正在發展為一個大數據中心。
In terms of size and volume, how would you compare GTCOM to its closest competitors?
從規模(size and volume 大小和數量可以歸結為規模)上講(In terms of(翻譯成從……上講)),您如何將中譯語通與你們最強勁的對手進行對比?
As I said, GTCOM is now a big data company and far bigger than any other translation company in China, and maybe in many other countries!
如我所說,中譯語通現在是一家大數據公司,規模及業務范圍(英文中省略的主語在中文中要補充)遠遠超過(far bigger than 不要簡單翻譯大的多)中國任何一家翻譯公司,也許還超過許多其他國家和地區的翻譯公司。
We started working on machine translation, for example, in 2014 and on big data in 2015. Each year we have invested on average about USD 30 million in R&D in machine translation. We have already filed a dozen patents for MT-related technologies. Our machine translation supports a total of 33 different languages, including Chinese, English, German, French, Japanese, Arabic, Portuguese, Russian and Korean. Of these, about 25 languages can be translated with our own Neural Machine Translation engine.
例如,我們從2014年開始致力于(working on)機器翻譯,2015年開始開展大數據業務(英文中常常用介詞來代替動詞。因此翻譯成中文時要具體根據名詞來選擇搭配選擇,此處的開展業務就是一例),平均每年投入3000萬美元用于研發。公司已經申請了十幾項機器翻譯技術(MT-related technologies)的專利(filed a dozen patents)。目前,我們的機器翻譯支持33種語言,包括中文、英語、德語、法語、日語、阿拉伯語、葡萄牙語、俄語、韓語等;其中,25種語言采用了我們自主研發的(own)神經網絡機器翻譯(Neural Machine Translation engine)技術(with? 看到介詞要注意是否翻譯成動詞)。
In addition to our translation and Call Center activities, we also provide video localization services, which are part of our translation services in China. About 85% of the work in this area is carried out using our YeeCaption toolkit, a one-stop smart subtitle translation software.
除了(In addition to)翻譯業(yè)務(wù)和呼叫中心(activities),我們還提供視頻本地化(video localization)服務(wù)抛猫,其中(in this area)約85%的內(nèi)容是通過我們的一站式智能字幕(后面補充語蟆盹,可以用破折號來解釋)翻譯工具——字幕通(YeeCaption)完成的。
Let me give you an example of a video job from March 2017. Our client, which was bringing huge amounts of video from overseas into China, planned to launch a short video clip business on a platform rather like YouTube. So we mobilized nearly 700 translators and localized about 830 hours of multilingual videos from a wide variety of content categories in just ten days. On top of the localization, we have become an IP provider, signing IP copyright partnerships with 37 of the biggest IP providers overseas, making us the exclusive operator for their video content in China. So once again, we have decided to go beyond the technical art of localization and work in distribution and production as well.
我來舉個視頻翻譯(video job,英語為了避免重復常用代替的詞語。要注意辨別)的例子。2017年3月,我們的客戶從海外引進海量短視頻,計劃在類似YouTube的平臺上推出短視頻業務(launch a short video clip business)。這些視頻種類多樣、內容豐富,并且涉及多個語種。我們啟用了將近700位翻譯,僅用10天時間就完成了時長約830個小時的視頻本地化翻譯(localized)。此外(On top of the localization),我們已經成為IP供應商,與海外37家最大的IP供應商簽署了版權合作協議(signing up the IP copyright partnerships),成為他們在中國的獨家(exclusive)經營商。因此,我們已經超越了本地化的技術服務(technical art),進一步向發行與制作(distribution and production)方面進軍(in又是介詞改動詞)。
Your ambitions are clearly far greater than just providing a translation service. Do you feel ready to take on technology companies such as Google yet?
中譯語通追求的(Your ambitions are clearly ……的雄心壯志就等于追求的)已經不僅僅(far greater than)是提供翻譯服務了。您認為你們已經做好準備與Google這樣的技術公司進行競爭(take on 說實話,有點不明覺厲)了嗎?
Unlike Google and other MT technology providers, we provide a domain-specific MT engine. In certain fields, our MT delivers higher quality as it can be tailored to news domains such as financial news, military news and others. In addition to our machine translation, we can collect data in about 65 different languages from 200+ countries, and keep updating this daily. Globally, we regularly update about 30 million articles and 500 million social media messages daily. According to data analytics firm Palantir, this is estimated to be worth $20 billion.
我們(每句話先想一想主語)與Google和其他機器翻譯技術供應商不同,我們提供特定領域(domain-specific)的機器翻譯引擎。比如在金融、軍事等領域可以進行定制(be tailored to),以便提供更高質量的服務。除了機器翻譯,我們還能收集全球200多個國家和地區的65種語言,并且每天進行更新。全球范圍內(Globally),我們的新聞日更新和處理能力達3000多萬篇,社交數據日更新和處理能力達5億條。(data analytics firm: 數據分析公司)
For us, cross-language big data is unstructured open-source data drawn from open-source news and from social media such as Twitter and Facebook, and WeChat and Weibo in China. We can analyze each piece of unstructured data. Our current line of big data products includes JoveBird, which is designed to take advantage of big data and AI technologies to offer financial investment solutions. Using a set of financial analysis models and powerful cross-language big data processing capability, it helps investors analyze stock-price trends and strategize their investments.
對我們而言,跨語言大數據是非結構化開源數據(unstructured open-source data),包括開源新聞數據以及Twitter、Facebook、微信、微博等社交媒體數據。我們可以分析每一條(each piece of)非結構化數據。比如我們的大數據產品JoveBird,通過一系列金融分析模型和強大的跨語言大數據處理能力,可以幫助投資人分析股價趨勢并制定投資戰略(strategize their investments. 制定戰略 動詞改名詞)。
We are indeed seeking new partners and targets, for example in localization. We’re also looking at advertising, consulting, and big data companies. China is already a huge market, and we are able to respond to the huge local-market demand of our customers. On the other hand, we have more than 10 products in our big data line-up, including a big data analytics platform for social media and a news toolkit. We also have an industrial big data platform, a mining platform and a series of other big data platforms - all leading big data technologies. So what we hope is that, together with our target partners, we will be able to provide them with cutting-edge data technologies and help them first explore the Chinese market and later the global market.
事實上,我們很希望與全球伙伴合作(seeking new partners and targets;localization: 本地化),包括廣告公司、咨詢公司和大數據公司等等,幫助他們在中國市場發展。中國是一個很大的市場,我們能夠很好地對客戶千變萬化的市場需求做出回應(we are able to respond to(很好的做出回應)the huge(翻譯成了千變萬化)local-market demand of our customers)。我們現在擁有10余個(more than 10 products 超過十個不怎么順溜,用十余個比較好)大數據產品,面向(for 翻譯成面向 social media)政府、企業、新聞媒體等行業的大數據分析平臺,還擁有工業大數據、數據采集平臺(mining platform)及其他大數據平臺。我們希望能與我們合作伙伴一起,運用我們最先進的數據技術(cutting-edge data technologies),來幫助他們打開(first explore)中國市場并擴展(later 一個later也要保證動詞,英文常常省略)全球市場。
Is translation slowly becoming a smaller part of your business while big data and AI grow bigger?
你們的大數據和人工智能業務發展越來越大(while斷句,一般后面為前提,先翻譯后面的),翻譯業務占據的比例是否會逐漸縮小(slowly becoming a smaller part of 英語常用名詞,中文的話可以翻譯成動詞)?
Yes, our language services are playing a smaller role in overall GTCOM revenue, but at the same time, this market is set to grow significantly. Language services are at a completely different market level compared to big data and AI which are growing much faster. What I’d like to highlight is that the language industry should be developed in an artificial intelligence direction, not in the traditional way. As I said in my keynote at the FIT congress in Australia, we must be open to AI and all work and grow together with it for the future of the language industry. We work, for example, with such large players as Haier, GE, Alibaba and many other industry clients. Beneath our entire big data platform, we now have a language technology infrastructure. So our big data platform can handle our key asset - cross-language big data.
目前,語言服務在中譯語通整體收入中的確只占據很小一部分(are playing a smaller role 同上也是名詞改動詞的方法),但未來,這個市場會有極大的發展。與發展迅速的大數據和人工智能相比,語言服務市場同前沿技術的融合相對緩慢(are at a completely different market level? 這里難以理解,我只會翻譯成兩者的水平并不在同一個層面上)。我想強調的是語言行業應該借助人工智能,而不是繼續遵循傳統方式(in the traditional way 動詞和名詞轉換)。就像我在澳大利亞的世界翻譯大會(FIT)主題演講中提到的一樣,我們必須迎接(be open to 迎接就是開放懷抱)人工智能,與其共同成長,為語言服務創建更好的未來(for the future of the language industry;創建(動詞))。我們之所以可以與海爾、通用電氣(GE, General Electric)、阿里巴巴等這樣的行業巨頭進行合作(clients 有客戶即與……合作),是因為(之所以……是因為,是根據原文邏輯關系推出的)我們擁有整個大數據平臺背后的語言技術,也正因為如此(so),我們的平臺才能處理跨語言大數據,這就是(破折號的作用,順序相反)我們的核心價值(key asset - cross-language big data)。
That’s the future. Take our industrial big data platform as an example: all those language technologies have been automatically embedded in the big data platform. Underneath the platform we have the language technology infrastructure. In this way, we have established a link between language and data, initiating a brand new future for the language service industry.
我們可以預見未來(That’s the future),以我們的大數據行業平臺為例,所有這些語言技術已經自動嵌入(embedded in)大數據平臺。通過這種方式,我們已經將語言和數據連接起來(established a link 不說建立聯系,不像中國話,名詞改動詞),為語言服務行業打造全新的未來(initiating a brand new future(打造未來,原意為開始,要靈活翻譯))。