Old 5th November 2013, 19:30
shadowfox
Member
 
Join Date: Oct 2013
P2P
Posts: 2
Quote:
Originally Posted by DeNeDe View Post
Well, of course it accepts duplicate torrents if the name is changed by even one character. Let's say you have MyTorrent.h264, then someone uploads Mytorrent-h264 or MyTorrent-H264.
If the name isn't identical to a previously uploaded torrent, the duplicate gets through.
I have changed how takeupload handles punctuation and spaces.
Basically I inverted the code so that a space is treated as punctuation, and every torrent ends up named like MyTorrent.h264.xxx.aac.
I don't think the torrent name is much of an issue: changing it doesn't change the info_hash, so dupes would still be caught. I've already altered the code to convert spaces and underscores to dots, as that seems to work best for RSS users who filter by name.
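
For illustration, here is a minimal sketch of that space/underscore-to-dot conversion. It is written in Python rather than the tracker's own code, which is not shown in this thread, and `normalize_name` is a hypothetical helper name.

```python
import re


def normalize_name(name: str) -> str:
    """Turn runs of spaces and underscores into single dots (hypothetical helper)."""
    dotted = re.sub(r"[ _]+", ".", name.strip())
    # Collapse repeated dots that appear when a dot sat next to a space or underscore.
    return re.sub(r"\.{2,}", ".", dotted)


print(normalize_name("My Torrent_h264 aac"))  # My.Torrent.h264.aac
```

Collapsing repeated dots is a design choice in this sketch; a tracker that wants to keep names such as "Some...Title" intact would skip that second step.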

Quote:
Originally Posted by joeroberts View Post
Just check the info_hash.
They may change the name, but that will stay the same.
That is mostly true, and I toyed with it a bit, but then I realized the real problem is which tool people use to create torrents and what settings they choose. The piece size of the chunks in the torrent alters the info_hash even if the file name and torrent name remain exactly the same, so the system treats it as an entirely new torrent.
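
To see why piece size matters: the info_hash is the SHA-1 of the bencoded "info" dictionary, and that dictionary contains both the piece length and the per-piece hashes. The sketch below builds a single-file info dictionary by hand in Python. The function names and the sample payload are made up for the demonstration, and the encoder is a bare-bones one, not a full torrent library.

```python
import hashlib


def bencode(value) -> bytes:
    """Minimal bencode encoder for ints, bytes, str, lists and dicts."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, str):
        return bencode(value.encode("utf-8"))
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings and must be written in sorted order.
        items = sorted(
            (k.encode("utf-8") if isinstance(k, str) else k, v)
            for k, v in value.items()
        )
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def info_hash(name: str, data: bytes, piece_length: int) -> str:
    """SHA-1 of the bencoded info dictionary for a single-file torrent."""
    pieces = b"".join(
        hashlib.sha1(data[i:i + piece_length]).digest()
        for i in range(0, len(data), piece_length)
    )
    info = {
        "name": name,
        "length": len(data),
        "piece length": piece_length,
        "pieces": pieces,
    }
    return hashlib.sha1(bencode(info)).hexdigest()


payload = b"x" * 100_000  # stand-in for the real file contents
hash_16k = info_hash("MyTorrent.h264", payload, 16 * 1024)
hash_32k = info_hash("MyTorrent.h264", payload, 32 * 1024)
print(hash_16k != hash_32k)  # True: same file, same name, different info_hash
```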

I'm wondering if there is an easy way to cross-reference multiple values and form a better detection routine?

1. Don't allow duplicate hashes; this will eliminate some dupes
2. Don't allow duplicate torrent names, maybe unless the new file size is larger?
3. Don't allow a file with the same size whose name loosely matches an existing file name?
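
A rough sketch of how those three checks could be combined into one routine, in Python. Everything here is an assumption made for illustration: the `Torrent` record, the `find_duplicate` name, the 0.8 similarity threshold, and the use of `difflib` for the "loose" name match.

```python
import re
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Iterable, Optional


@dataclass
class Torrent:
    info_hash: str
    name: str
    size: int


def _key(name: str) -> str:
    """Lower-case the name and drop punctuation so 'Mytorrent-h264' matches 'MyTorrent.h264'."""
    return re.sub(r"[^a-z0-9]+", "", name.lower())


def find_duplicate(new: Torrent, existing: Iterable[Torrent],
                   threshold: float = 0.8) -> Optional[str]:
    """Return the reason the upload looks like a dupe, or None if it passes."""
    new_key = _key(new.name)
    for old in existing:
        old_key = _key(old.name)
        # Rule 1: identical info_hash.
        if new.info_hash == old.info_hash:
            return "same info_hash"
        # Rule 2: same name, unless the new upload is larger.
        if new_key == old_key and new.size <= old.size:
            return "same name, not larger"
        # Rule 3: same size and a loosely matching name.
        similarity = SequenceMatcher(None, new_key, old_key).ratio()
        if new.size == old.size and similarity >= threshold:
            return "same size, similar name"
    return None
```

For example, with `Torrent("aa" * 20, "MyTorrent.h264", 1000)` already on the tracker, an upload named "Mytorrent-h264" of the same size is caught by rule 2, while the same name with a larger size passes. The threshold would need tuning against real upload data.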

No single method seems effective on its own. I wonder if creating a unique index on the size column would be too drastic? There's already one on info_hash and one on the torrent name. Any idea what the odds are of two file sizes being exactly the same?
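
Whatever the odds turn out to be, a unique index makes any collision a hard reject. The sketch below shows that behaviour using Python's built-in SQLite as a stand-in for the tracker's database; the table layout, index name, and sample rows are invented for the demonstration. A plain (non-unique) lookup on size, used only to fetch candidates for a closer name comparison, is shown as a softer alternative.

```python
import sqlite3


def unique_size_rejects_unrelated() -> bool:
    """Return True if a UNIQUE index on size blocks an unrelated upload of equal size."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE torrents (info_hash TEXT UNIQUE, name TEXT UNIQUE, size INTEGER)"
    )
    conn.execute("CREATE UNIQUE INDEX idx_torrents_size ON torrents (size)")
    conn.execute(
        "INSERT INTO torrents VALUES (?, ?, ?)", ("aaa", "First.Release", 734003200)
    )
    try:
        # Different hash, different name, same byte count.
        conn.execute(
            "INSERT INTO torrents VALUES (?, ?, ?)",
            ("bbb", "Unrelated.Release", 734003200),
        )
    except sqlite3.IntegrityError:
        return True
    finally:
        conn.close()
    return False


def candidates_with_size(conn: sqlite3.Connection, size: int) -> list:
    """Softer alternative: list names with the same size for further checking."""
    rows = conn.execute(
        "SELECT name FROM torrents WHERE size = ? ORDER BY name", (size,)
    )
    return [row[0] for row in rows]


print(unique_size_rejects_unrelated())  # True
```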

Last edited by shadowfox; 5th November 2013 at 20:25.