Does anybody know how to translate a document in .PDF format Thread poster: Cécile Trotin (X)
| Cécile Trotin (X) Local time: 17:58 English to French + ...
Thanks to Trados, I have been able to translate documents in various formats without too many problems, but I still wonder whether there is a way to translate PDF documents while keeping the format and, if so, how to do it. | | | insider tip : Wordfast does it | Feb 3, 2002 |
Hi Cécile, Just try Wordfast (http://www.champollion.net), which converts the content of a PDF file into Word. This feature is undocumented in Wordfast's manual, so search the archives of the Wordfast group (http://groups.yahoo.com/group/wordfast). However, if you want to generate a PDF file, you need the corresponding tool (Acrobat Distiller, or free conversion tools).
Regards
Samy | | | .pdf = Acrobat | Feb 3, 2002 |
Hi! If you have a .pdf file it was created using the full version of Acrobat (i.e. not just Acrobat Reader). I don't have the full version (yet!), so I can't tell you if Trados will work with that format.
My way of dealing with it is to mark the entire document in Acrobat Reader 5.0 using Ctrl+A or Select All, and to copy and paste it into a Word document. This messes up the format, and sometimes some parts don't... | | | PDF is by nature a format for final distribution, not for editing. | Feb 3, 2002 |
PDF is by nature a format for final distribution, not for editing. Receiving it as the starting point for a translation is not that different from receiving the text on hard copy or by fax...
You can extract the text from it in various ways (select all, then copy and paste, as Alison says; or use the File > Save As: RTF command in Acrobat 5). Either way (unless the PDF originates from one of the new generation DTP programs able to embed document structure information in the PDF), the flow of the text in the resulting .rtf file will probably not match the reading flow of the text in the PDF file, and it will need some cleaning up: removing extra carriage returns, moving a few paragraphs to match the original flow, making white text black so as to be able to see it, etc.
However, all PDF files originate from other programs (Word, PageMaker, FrameMaker, QuarkXPress...), and when receiving a PDF file for translation it is always a good idea to ask the client for the original, editable files.
I think the client should be made aware that extracting the text from PDFs can be problematic and time-consuming, and that it therefore requires more time and will cost him more money... once this is explained, original files usually become readily available! | |
|
|
Very sound advice, Roberta! | Feb 3, 2002 |
Quote: On 2002-02-03 13:49, Roberta A. wrote: PDF is by nature a format for final distribution, not for editing. Receiving it as the starting point for a translation is not that different from receiving the text on hard copy or by fax...
You can extract the text from it in various ways (select all, then copy and paste, as Alison says; or use the File > Save As: RTF command in Acrobat 5). Either way (unless the PDF originates from one of the new generation DTP programs able to embed document structure information in the PDF), the flow of the text in the resulting .rtf file will probably not match the reading flow of the text in the PDF file, and it will need some cleaning up: removing extra carriage returns, moving a few paragraphs to match the original flow, making white text black so as to be able to see it, etc.
However, all PDF files originate from other programs (Word, PageMaker, FrameMaker, QuarkXPress...), and when receiving a PDF file for translation it is always a good idea to ask the client for the original, editable files.
I think the client should be made aware that extracting the text from PDFs can be problematic and time-consuming, and that it therefore requires more time and will cost him more money... once this is explained, original files usually become readily available!
This is what makes these forums so interesting: the 'camaraderie' Marcus referred to in another forum and the willingness to help others.
Have a nice day. | | | | pdfs are the jinx | Feb 3, 2002 |
Hi,
Clients love to send us PDFs, and few realize what a major jinx they are for translators. If copying and pasting is not enough for you because you want to keep the formatting, try OmniPage 11, available at http://www.scansoft.com, which can read .pdf files so that you can OCR them. But even with OP11 it's not a given: you have to determine table zones and make templates to make the process go smoother. So the best advice is to really stick to your guns and refuse to start working unless the client provides you with an editable file (FrameMaker, RTF, Word, what have you). Or charge at least a 100% surcharge for the extra work ahead. Or state very clearly to your client that the formatting is going to be all messed up and you're taking no responsibility for it.
ME
[ This Message was edited by: on 2002-02-03 15:45 ]
[ This Message was edited by: on 2002-02-04 10:14 ] | | | Mats Wiman Sweden Local time: 17:58 Member (2000) German to Swedish + ... In memoriam Gemini 2 washes the text of a PDF file out to text | Feb 3, 2002 |
with some of the formatting problems remaining, but still!
http://www.iceni.com
It's worth its price (which I don't remember)
Mats Wiman | |
|
|
Erika Pavelka (X) Local time: 11:58 French to English Get the original format of the document | Feb 3, 2002 |
Quote: On 2002-02-03 13:30, alison1969 wrote: Hi! If you have a .pdf file it was created using the full version of Acrobat (i.e. not just Acrobat Reader). I don't have the full version (yet!), so I can't tell you if Trados will work with that format.
My way of dealing with it is to mark the entire document in Acrobat Reader 5.0 using Ctrl+A or Select All, and to copy and paste it into a word docume ... | | | Werner George Patels, M.A., C.Tran.(ATIO) (X) Local time: 11:58 German to English + ... PDFs must be treated in the same way as hardcopies or faxes | Feb 3, 2002 |
I agree with Roberta: PDFs may be a nice tool for companies to publish their brochures, price lists, etc., but for us they are a major headache.
I always request the original files for editing. If the client cannot provide them, the rate will be increased (same for hardcopies or faxes).
Note to Adobe and CAT programmers: please find a way to make pdf files more manageable. | | | Ralf Lemster Germany Local time: 17:58 English to German + ... Problems with 'copy & paste' / 'save as RTF' | Feb 3, 2002 |
Hi! Most info has already been provided; there are just a few potential pitfalls you should be aware of:
- If the PDF document has been protected in the course of the production process, you cannot copy text from it (IOW the document is locked); in that case, saving as RTF (which only works if you have the full Adobe Acrobat 5.0, not just the Reader) is blocked as well.
- If 'copy and paste' works, you will still have to edit the file manually; not only will the text flow usually differ from the original layout, but you'll also find a paragraph mark (line feed) at the end of each line. This is "poison" for Trados (in fact, for any TM system, since it chops sentences into pieces). This is the same behaviour as when cutting and pasting from web pages.
These are things you should take into account when pricing such a job.
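The second pitfall, a paragraph mark at the end of every pasted line, can often be reduced with a small script before the text goes into a TM tool. The Python sketch below is only an illustration under stated assumptions: the name `unwrap_lines`, the `short_ratio` threshold and the punctuation list are invented for this example, and the heuristic (a paragraph ends at a blank line, or at a noticeably short line that ends in sentence punctuation) will misjudge some layouts, so the result still needs a manual check.

```python
# Hypothetical helper, not part of Trados, Wordfast or Acrobat.
# Assumed heuristic: a paragraph ends at a blank line, or at a line that
# ends in sentence punctuation AND is clearly shorter than the longest line.
SENTENCE_END = (".", "!", "?", ":")


def unwrap_lines(text: str, short_ratio: float = 0.8) -> str:
    """Join hard-wrapped lines into one line per paragraph."""
    lines = [line.strip() for line in text.splitlines()]
    longest = max((len(line) for line in lines), default=0)

    paragraphs = []
    current = []  # lines collected for the paragraph being built

    for line in lines:
        if not line:
            # A blank line always closes the current paragraph.
            if current:
                paragraphs.append(" ".join(current))
                current = []
            continue

        current.append(line)
        ends_sentence = line.endswith(SENTENCE_END)
        is_short = len(line) < short_ratio * longest
        if ends_sentence and is_short:
            # Short final line of a paragraph: close it here.
            paragraphs.append(" ".join(current))
            current = []

    if current:
        paragraphs.append(" ".join(current))

    # One paragraph per line, so the TM tool sees whole sentences.
    return "\n".join(paragraphs)
```

Called on text pasted from Acrobat Reader, for example `unwrap_lines(pasted_text)`, it returns one line per detected paragraph. Raising or lowering `short_ratio` makes the paragraph detection more or less eager.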
HTH - Ralf | | | Cécile Trotin (X) Local time: 17:58 English to French + ... TOPIC STARTER Wordfast seems to be a really good tool | Feb 3, 2002 |
Thank you very much for your advice. I downloaded Wordfast. It seems to be a very good tool and I advise all those who visit this forum to try it. And it's free! | |
|
|
Holly Hart United States Local time: 10:58 Member (2002) German to English + ... PDF is not an editing format | Feb 4, 2002 |
I agree with the others about PDF not being amenable to editing. It is meant to be a FINAL UNEDITABLE version. In fact, when scientists submit grants electronically to NSF (National Science Foundation) for government funding, the grant requests are converted upon submission to PDF format, the reason being that they cannot then be edited by anyone who later reviews the grant for funding suitability, or by the granting agency itself. It is meant to be a protection for the author. If you have the software to convert a document to PDF, the same software package should have the reverse procedure as well. | | | Ralf Lemster Germany Local time: 17:58 English to German + ... Re: PDF is not an editing format | Feb 4, 2002 |
I agree with what 'hwhart' said, except...
If you have the software to convert a document to PDF, the same software package should have the reverse procedure as well.
Adobe Acrobat and the related utilities (such as Acrobat Distiller) can convert almost anything into PDF; unfortunately, reversing that process does not always work. As many in this thread pointed out, getting the source file(s) is essential. | | | Hope for the future | Feb 4, 2002 |
One of the biggest problems when converting text from a PDF into RTF is, as mentioned above, the troublesome carriage returns at the end of every line.
I briefly mentioned PDFs from "new generation DTP applications" in my earlier posting: a new way of tagging text while exporting it as a PDF file has been introduced in InDesign 2 (and I think also in PageMaker 7), so that when it is exported back as RTF the text retains its original paragraph-by-paragraph structure and the horrible extra carriage returns become history.
So it is only a matter of time until other applications can do the same. [Then, of course, the user must be aware of this little feature and use it...]
In the meantime, although there is no such thing as converting a PDF back to its original file format, short PDF documents (ad pages, packaging or brochures) can be opened and edited in Illustrator and FreeHand (both are PostScript applications, hence able to understand PDF, which is also a PostScript derivative). Of course, both of these are vector drawing applications and need some specific skills - nothing to do with Word or DTP applications!