1 | | [wiki:Video] |
| 1 | = OpenSubtitles v2 draft specification - Request for comments = |
| 2 | |
| 3 | |
| 4 | == Subtitles section == |
| 5 | |
| 6 | We want to avoid duplicate subtitles as much as possible, so in an ideal world |
| 7 | there would be only one subtitle for many releases. We try to approach this by using |
| 8 | SubLib. The system would store only one subtitle for each version of a movie: |
| 9 | [Matrix], [Matrix - Extended Cut], [Matrix - Director's Cut] and so on. This is the ideal |
| 10 | situation, and the current world is not ideal. |
| 11 | |
| 12 | There are many versions of each rip, so ideally there should be just a set of |
| 13 | rules describing how to change the original subtitles to fit a given movie version. For example: |
| 14 | 1. Take [Matrix] subtitles id 123456 |
| 15 | 2. Change frame-rate from 25 FPS to 23.976 FPS |
| 16 | 3. Add 3 seconds to the beginning |
| 17 | 4. Cut subtitles at 1 hour 25 minutes 45 seconds 78 milliseconds and make 2 files |
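The four rules above could be sketched roughly as follows. This is only an illustration, not SubLib's API: the `Cue` type and the helper names `change_fps`, `shift`, `split_at` and `retime_for_rip` are hypothetical, times are kept in milliseconds, and the target frame rate is assumed to be the usual 23.976 FPS.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Cue:
    """One subtitle line with start/end times in milliseconds (hypothetical type)."""
    start_ms: int
    end_ms: int
    text: str


def change_fps(cues: List[Cue], src_fps: float, dst_fps: float) -> List[Cue]:
    # Subtitles timed for src_fps video are stretched by src_fps / dst_fps
    # when the same frames are played back at dst_fps.
    factor = src_fps / dst_fps
    return [Cue(round(c.start_ms * factor), round(c.end_ms * factor), c.text)
            for c in cues]


def shift(cues: List[Cue], offset_ms: int) -> List[Cue]:
    # Move every cue later (positive offset) or earlier (negative offset).
    return [Cue(c.start_ms + offset_ms, c.end_ms + offset_ms, c.text)
            for c in cues]


def split_at(cues: List[Cue], cut_ms: int) -> Tuple[List[Cue], List[Cue]]:
    # Cues starting before the cut stay in the first file; the rest go to the
    # second file, re-based so that the second file starts at zero.
    first = [c for c in cues if c.start_ms < cut_ms]
    second = [Cue(c.start_ms - cut_ms, c.end_ms - cut_ms, c.text)
              for c in cues if c.start_ms >= cut_ms]
    return first, second


def retime_for_rip(master: List[Cue]) -> Tuple[List[Cue], List[Cue]]:
    """Apply rules 2-4 from the example to an already loaded master subtitle."""
    cues = change_fps(master, 25.0, 23.976)          # rule 2
    cues = shift(cues, 3000)                         # rule 3: add 3 seconds
    cut_ms = ((1 * 60 + 25) * 60 + 45) * 1000 + 78   # rule 4: 1:25:45.078
    return split_at(cues, cut_ms)
```

Because each rule is a pure function on the cue list, an edit to the master subtitle would carry over to every rip simply by re-running the stored rules.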
| 18 | |
| 19 | With these rules we can represent any version and, hopefully, cover any need for a movie. |
| 20 | One big advantage of this system is wiki-style subtitle editing: you watch a |
| 21 | movie, find a bad translation or typo, log in to the site and edit |
| 22 | the subtitles online (or with a program that implements our API). |
| 23 | All these changes will then be present in all versions. |
| 24 | |
| 25 | So the system will hold one master subtitle for each version of a movie (original version, |
| 26 | director's cut). This is the starting point, and the database will hold rules for how to "change" it |
| 27 | for different movie rips. The master subtitle is stored in the database, and by analyzing |
| 28 | the timestamps of other uploaded subtitles we derive rules for retiming it to other movie |
| 29 | rips. That is the theory. |
| 30 | |
| 31 | Wiki editing - users should be able to edit and translate subtitles online, with a versioning |
| 32 | system. All changes will be tracked, so in the end the system will know how many changes |
| 33 | were made by each user. |
| 34 | |
| 35 | Subtitles export - subtitles in the system are saved as metadata, so the user can choose any |
| 36 | subtitle format they want. |
| 37 | |
| 38 | Realtime re-timing, cutting - SubLib supports re-timing, cutting and "moving" subtitles, so |
| 39 | this should also be available online and via the API. |
| 40 | |
| 41 | == Movie section == |
| 42 | |
| 43 | Implement more than one movie website. Currently only imdb.com is implemented, which is not |
| 44 | bad, but it does not provide any official API access to its database. That is why we need to |
| 45 | implement sites like themoviedb.org and tvseries.org (?). |
| 46 | |
| 47 | Movie hashing - there is some need for a stronger hash, which requires research into how to |
| 48 | do it properly, because the current implementation (CRC64) is weak and can lead to collisions in |
| 49 | the future (so far there have been no collisions). Ideally, the system should support several kinds of hashes. |
| 50 | I think it is a wrong idea to put information such as movie length, FPS or movie dimensions |
| 51 | into the hash. The hash should depend only on the file, for example the first and last 128 kB (sha1) and |
| 52 | the file size, hashed together (sha1). |
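A minimal sketch of such a file-only hash, under some assumptions the draft does not fix: "128kb" is read as 128 KiB, the head, tail and file size are fed into a single SHA-1 in that order, and the size is encoded as an 8-byte little-endian integer. The function name `movie_hash` is hypothetical.

```python
import hashlib
import os
import struct

CHUNK = 128 * 1024  # assumed meaning of "128kb": 128 KiB


def movie_hash(path: str) -> str:
    """Hash the first and last CHUNK bytes plus the file size with SHA-1.

    For files shorter than 2 * CHUNK the head and tail regions overlap,
    which is harmless: the result still depends only on the file itself.
    """
    size = os.path.getsize(path)
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        digest.update(f.read(CHUNK))         # head of the file
        f.seek(max(0, size - CHUNK))
        digest.update(f.read(CHUNK))         # tail of the file
    digest.update(struct.pack("<Q", size))   # file size, 8 bytes little-endian
    return digest.hexdigest()
```

Only about 256 KiB is read regardless of file size, so the hash stays cheap for large movies. The trade-off is that changes in the middle of a file go unnoticed unless they also change its size.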
| 53 | |
| 54 | Media information - the database needs to store such information, but the problem is that |
| 55 | implementations can differ between programs. The most important value is FPS. |
| 56 | |
| 57 | == User Section == |
| 58 | |
| 59 | Registration - as simple as possible: UserName, Email, Password, possibly login via social sites, |
| 60 | OpenID and so on (rpxnow.com). |
| 61 | |
| 62 | Groups and permissions - similar to the current version of opensubtitles. There are permissions, |
| 63 | groups have one or more permissions, and a user can belong to one or more groups. |
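The relationship described above (permissions belong to groups, users belong to groups) might be modelled along these lines. This is a plain-Python sketch, not the current opensubtitles schema; the class names and the example permission strings are made up for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Group:
    """A named group holding one or more permissions (hypothetical model)."""
    name: str
    permissions: Set[str] = field(default_factory=set)


@dataclass
class User:
    """A user who can belong to one or more groups (hypothetical model)."""
    username: str
    groups: List[Group] = field(default_factory=list)

    def has_permission(self, permission: str) -> bool:
        # A user holds a permission if any of their groups grants it.
        return any(permission in g.permissions for g in self.groups)


# Example permission names below are invented for this sketch.
translators = Group("translators", {"subtitle.edit", "subtitle.translate"})
moderators = Group("moderators", {"subtitle.edit", "subtitle.delete"})
alice = User("alice", [translators])
bob = User("bob", [translators, moderators])
```

With this layout `alice.has_permission("subtitle.delete")` is false while `bob.has_permission("subtitle.delete")` is true, because only the moderators group carries that permission.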
| 64 | |
| 65 | == Translator Groups == |
| 66 | |
| 67 | There should be some _good_ support for subtitle translators and their groups. This needs more |
| 68 | research into how to do it properly. |
| 69 | |
| 70 | == Website Translation == |
| 71 | |
| 72 | The current system of website translation is OK. |
| 73 | |
| 74 | == API Access == |
| 75 | |
| 76 | Only registered user agents will have API access, using their key. The API should be offered |
| 77 | over different standards such as XML-RPC (current), REST, JSON... |
| 78 | |
| 79 | == Caching == |
| 80 | |
| 81 | |
| 82 | |
| 83 | == Software specification == |
| 84 | |
| 85 | Lighttpd as HTTP server, |
| 86 | PostgreSQL as database server, Python as programming language, |
| 87 | Django as framework, Memcached for memory caching, (research) for subtitle caching, |
| 88 | |
| 89 | Sphinx-search for fulltext search |
| 90 | |
| 91 | |
| 92 | == Study == |
| 93 | |
| 94 | X-Send-File |
| 95 | how to cache subtitles - db, filesystem,...? |