Ticket #1260 (closed task: fixed)

Opened 13 years ago

Last modified 13 years ago

Evaluate conversion tools

Reported by: tarmo Owned by: laszlo
Priority: critical Milestone: 1.9
Component: generic Version:
Keywords: Cc:
Time planned: 8h Time remaining:
Time spent: 5h


(story #1048, story #558) Locate and evaluate tools that can convert audio from anything to mp3, videos from anything to flv and office presentations into images.

Change History

comment:1 Changed 13 years ago by laszlo

  • Owner changed from anonymous to laszlo
  • Status changed from new to assigned
  • Time planned set to 8h

comment:2 Changed 13 years ago by tarmo

  • Priority changed from blocker to critical

comment:3 Changed 13 years ago by laszlo

  • Status changed from assigned to closed
  • Time spent set to 5h
  • Resolution set to fixed

For media conversion, I think the best solution is ffmpeg. It supports many video and audio formats. It can convert audio files to mp3 (I've tested with wav input) and video files to flv (I've tested with avi input).
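A minimal sketch of how these two conversions could be driven from Python, assuming an `ffmpeg` binary is on the PATH. The helper names (`target_for`, `ffmpeg_command`, `convert`) are made up for illustration and are not part of LeMill. Only the basic `-y` (overwrite) and `-i` (input) options are used, and ffmpeg is left to pick the output format from the file extension; codec and bitrate settings would still need tuning.

```python
import subprocess
from pathlib import Path
from typing import List, Union

PathLike = Union[str, Path]


def target_for(src: PathLike, extension: str) -> Path:
    """Return the output path: same name as src, with the new extension."""
    return Path(src).with_suffix("." + extension.lstrip("."))


def ffmpeg_command(src: PathLike, dst: PathLike) -> List[str]:
    """Build the ffmpeg argument list for converting src to dst.

    -y overwrites an existing output file, -i names the input file.
    The output format is inferred by ffmpeg from the dst extension.
    """
    return ["ffmpeg", "-y", "-i", str(src), str(dst)]


def convert(src: PathLike, extension: str) -> Path:
    """Run ffmpeg and return the output path; raises if ffmpeg fails."""
    dst = target_for(src, extension)
    subprocess.run(ffmpeg_command(src, dst), check=True)
    return dst
```

With this, `convert("talk.wav", "mp3")` and `convert("intro.avi", "flv")` would cover the two cases tested above.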

For image conversion, the traditional solution is ImageMagick, and I think we should use it too.

For PPT-to-image conversion, we could use OpenOffice. There are two ways to use it. The first is to write a conversion macro and run it with OpenOffice in silent mode. Two examples of this: http://www.ooomacros.org/user.php#94879 http://www.xml.com/pub/a/2006/01/11/from-microsoft-to-openoffice.html

Another (and better) solution is to use Python. There is a Python daemon that can run OpenOffice instances in server mode. The daemon is called oood: http://udk.openoffice.org/python/oood/. Python scripts can access this OpenOffice service and run commands on it. With this solution, we could write the conversion routines in Python.

All of these conversion methods can be very slow, so we should run them in separate processes (as Szabi suggested).
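A rough sketch of what "separate processes" could look like, assuming each converter is an external command-line program. The names `ConversionResult` and `run_conversion` and the 300-second default timeout are placeholders chosen for this example. The conversion runs as a child process with a time limit, so a slow or hung conversion cannot stall the caller indefinitely.

```python
import subprocess
import sys
from typing import NamedTuple, Optional, Sequence


class ConversionResult(NamedTuple):
    ok: bool                    # True only if the tool exited with status 0
    returncode: Optional[int]   # None if the tool never ran or was killed
    error: str                  # stderr output or a short failure reason


def run_conversion(cmd: Sequence[str], timeout: float = 300.0) -> ConversionResult:
    """Run one conversion command as a child process with a time limit."""
    try:
        proc = subprocess.run(
            list(cmd), capture_output=True, text=True, timeout=timeout
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising this exception.
        return ConversionResult(False, None, "timed out")
    except FileNotFoundError:
        return ConversionResult(False, None, "command not found: " + cmd[0])
    return ConversionResult(proc.returncode == 0, proc.returncode, proc.stderr)


if __name__ == "__main__":
    # Stand-in for a real converter: a child Python process that just exits.
    print(run_conversion([sys.executable, "-c", "pass"]))
```

To keep the web request itself from waiting, calls like this could be handed to a worker thread or a job queue, with the result picked up later.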

So I think this information is enough to begin with, and we can start developing these features. I think the last one (PPT-to-image conversion) is the most important.

comment:4 Changed 13 years ago by szabolcs

I think media conversion should be separated from LeMill, as it doesn't have much to do with learning. This way it could be developed separately, and other projects would probably find it useful, too. The first -- stupid -- implementation could work over HTTP for interoperability, and could later be refined to use shared memory for efficiency.

Another thing about PPT: first, we should implement this 'PPT-to-image conversion', as it is the easiest way to work with PPT files. Once this is done, we should talk about extracting text (and other objects) with some basic layout, to achieve a unified look and searchability. The same goes for .doc files.
