Quote:
Originally Posted by DeNeDe
Well, of course it uploads duplicate torrents if the name is changed by even one character. Let's say you have MyTorrent.h264, then someone uploads Mytorrent-h264 or MyTorrent-H264.
If the name isn't exactly the same as the previously uploaded torrent, it will get through as a duplicate.
I have changed how takeupload handles punctuation and spaces.
Basically I inverted the code: wherever there is a space it becomes punctuation, so every torrent ends up as MyTorrent.h264.xxx.aac
|
I don't think the torrent name would be much of an issue; changing it doesn't change the info_hash, so dupes would still be caught. I've already altered the code to change spaces and underscores to dots, as that seems to be the best format for RSS users to filter on.
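For anyone following along, the rename can be sketched roughly like this. It's in Python purely for illustration, not the actual takeupload code, and normalize_torrent_name is a made-up name. Collapsing repeated dots and trimming leading/trailing dots are assumptions added here, not something the altered code is known to do.

```python
import re

def normalize_torrent_name(name: str) -> str:
    """Turn spaces and underscores into dots (hypothetical helper)."""
    # Any run of spaces/underscores becomes a single dot.
    dotted = re.sub(r"[ _]+", ".", name.strip())
    # Assumption: collapse doubled dots, e.g. "My. Torrent" -> "My.Torrent".
    dotted = re.sub(r"\.{2,}", ".", dotted)
    # Assumption: no leading or trailing dots in the stored name.
    return dotted.strip(".")

print(normalize_torrent_name("My Torrent_h264 aac"))
```

Note that this only tidies the display name for RSS filtering; it does nothing for dupe detection by itself.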
Quote:
Originally Posted by joeroberts
Just do a check on info_hash.
They may change the name, but that will stay the same.
|
That is mostly true, and I toyed with it a bit, but then I realized the real problem is which tool people use to create torrents and what settings they use. The piece size chosen for the torrent alters the info_hash even if the file name and torrent name remain exactly the same. The system treats this as an entirely new torrent.
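To show why the piece size matters: the info_hash is the SHA-1 of the bencoded "info" dictionary, and that dictionary contains both the piece length and the per-piece hashes. Here is a minimal sketch in Python (illustration only; bencode, build_info and info_hash are made-up helper names, and the tiny encoder covers only the types needed for a single-file torrent).

```python
import hashlib

def bencode(value) -> bytes:
    """Minimal bencoder for ints, bytes, str, lists and dicts."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are emitted in sorted order.
        items = sorted(value.items(), key=lambda kv: kv[0])
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")

def build_info(data: bytes, name: str, piece_length: int) -> dict:
    """Build a single-file info dictionary for the given payload."""
    pieces = b"".join(
        hashlib.sha1(data[i:i + piece_length]).digest()
        for i in range(0, len(data), piece_length)
    )
    return {"name": name, "length": len(data),
            "piece length": piece_length, "pieces": pieces}

def info_hash(info: dict) -> str:
    return hashlib.sha1(bencode(info)).hexdigest()

payload = bytes(range(256)) * 512  # 128 KiB of dummy content
small = info_hash(build_info(payload, "MyTorrent.h264", 16384))
large = info_hash(build_info(payload, "MyTorrent.h264", 32768))
print(small == large)  # same file, same name, different piece size
```

Same payload and same name, yet the two hashes differ, which is exactly why an info_hash check alone lets these through.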
I'm wondering if there is an easy way to cross-reference multiple values and form a better detection routine?
1. Don't allow duplicate hashes; this will catch some dupes
2. Don't allow duplicate torrent names, maybe unless the new file size is larger?
3. Don't allow a file with the same size which loosely matches the same file name?
No one method seems effective on its own. I wonder if creating a unique index on the size column would be too drastic? There's already one on info_hash and one on the torrent name. Any idea what the odds are of two different files having exactly the same size?
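The three checks above could be combined in code rather than with a unique index on size, which would also block unrelated uploads that just happen to be the same size. A rough sketch in Python, for illustration only: Torrent, name_key and find_duplicate are made-up names, the list stands in for the torrents table, and the 0.8 similarity threshold is an arbitrary assumption that would need tuning.

```python
import re
from dataclasses import dataclass
from difflib import SequenceMatcher

@dataclass
class Torrent:
    name: str
    size: int       # total size in bytes
    info_hash: str  # hex string

def name_key(name: str) -> str:
    """Lowercase and reduce all punctuation/spacing to dots for comparison."""
    return re.sub(r"[^a-z0-9]+", ".", name.lower()).strip(".")

def find_duplicate(new: Torrent, existing: list, threshold: float = 0.8):
    """Return (reason, matching_torrent) for the first hit, or None."""
    new_key = name_key(new.name)
    for old in existing:
        # 1. Identical info_hash is always a dupe.
        if old.info_hash == new.info_hash:
            return "info_hash", old
        old_key = name_key(old.name)
        # 2. Same name is a dupe unless the new upload is larger.
        if old_key == new_key and new.size <= old.size:
            return "name", old
        # 3. Same size plus a loosely matching name.
        similar = SequenceMatcher(None, old_key, new_key).ratio() >= threshold
        if old.size == new.size and similar:
            return "size+name", old
    return None

existing = [Torrent("MyTorrent.h264", 1000, "aa")]
print(find_duplicate(Torrent("Mytorrent-h264", 1000, "bb"), existing))
print(find_duplicate(Torrent("OtherFile.xvid", 1000, "cc"), existing))
```

Check 3 only fires when the size matches exactly and the names are close, so two unrelated files of the same size still get through.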