PSA: Reporting Duplicates

Posts: 27 · Views: 1164
  • 8815
    About half of all reports are for duplicates, and a lot of these are reported wrongly these days. So, to hopefully get us all on the same page and save people some time and effort:
    Wallpapers are merged if the resolutions are the same or almost the same (to a margin of about 5-10%, so a 1920x1280 version would usually be merged with a 1920x1200 one). This is really just to weed out versions like 1920x1081, which are technically not identical to 1080p ones but are utterly unnecessary. When merging, the oldest wall of the same or almost the same resolution is the one the other walls get merged into. There are exceptions to this if:
    1. the older wall is of visibly lower quality
    2. the older wall has had its original author's watermark removed (and user marks don't count, kids).
    In these cases, better quality or being watermarked takes precedence over age. If an upload of yours was deleted and you don't understand why, it might be because of this.
    Otherwise (when the resolutions clearly differ), walls should be grouped. When this needs to be done, just report under Other and put the URL of the original wall in the info field. Walls that are already grouped shouldn't be reported as dupes without good reason.
    Please refer back to this if you're unsure when reporting, and discuss as well. We realise it's not fun taking the time to upload walls and having them taken down for being dupes; it's not an exact science, it's a work in progress, etc...
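For anyone who prefers to see the merge rule spelled out, here is a rough Python sketch of it. It is only an illustration of the rule as described in this post, not the site's actual code: the `Wall` class, its field names, the 10% default margin, and the idea that a lower id means an older upload are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Wall:
    wall_id: int                     # assumption: lower id = older upload
    width: int
    height: int
    low_quality: bool = False        # visibly lower quality than the others
    watermark_removed: bool = False  # original author's watermark stripped


def similar_resolution(a, b, margin=0.10):
    """True if width and height each differ by at most `margin` (relative to the larger value)."""
    def close(x, y):
        return abs(x - y) <= margin * max(x, y)
    return close(a.width, b.width) and close(a.height, b.height)


def merge_target(walls):
    """Pick the wall the others get merged into.

    Assumes `walls` are copies of the same image at similar resolutions.
    The oldest wins, unless it is low quality or had its watermark removed.
    """
    clean = [w for w in walls if not (w.low_quality or w.watermark_removed)]
    candidates = clean or list(walls)  # fall back to all walls if none is clean
    return min(candidates, key=lambda w: w.wall_id)
```

With these assumptions, 1920x1280 and 1920x1200 count as similar (merge), while 1920x1080 and 3840x2160 do not (group instead).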
  • 8825
    First, if I see two identical wallpapers, one at 1920x1080 and the other at, let's say, 3840x2160, should I report under 'Other' and provide a link to the differently sized wallpaper? Second, if two or more wallpapers are grouped because of different resolutions, where can I see the other sizes? What if I see a wallpaper in Full HD, I know it's grouped with 2K/4K versions, and I want the bigger image on my desktop?
  • 8828
    @dwemer, if it's grouped then right under the resolution it has a link for "1 more size"
  • 8834
    sannukas0016, thank you, now it's clear.
    Added 2016-06-07 16:04:31
    I have another question related to groups of wallpapers. Let's say I have some wallpaper in many different resolutions: 4K, 2K, Full HD and so on. Should I upload all of them, or is the biggest one enough?
  • 8847
    The biggest one would be the best option, but you could upload all of them and just group them (which I believe is automatic, not entirely sure).
  • 8857
    You should not upload all resolutions but instead only the largest and, if that is not already a standard resolution, one standard resolution (like 1920x1080). If you upload more than that, we will have to delete them manually and will likely become annoyed.
  • 8861
    I found these two Wallpapers:
    1920 x 1080
    [383001] They differ only minimally in resolution. So is this a case of a duplicate? Shall I report it?
  • 8942
    WallpaperManiac, there's an example of two walls of similar enough resolutions, so that's the kind of thing we'd like folks to report. I've already merged the second with the first.
  • 8958
    What about these?
    4500 x 3168
    [377623] They are different sizes and almost the same, but differ in details. Should I report something like these to be grouped, or should they be treated as completely different?
  • 8959
    @dwemer, the differences are big enough that you can't count them as 1 image.
  • 8978
    What about low effort collages like this that are just other users' uploads in non-standard resolutions?
    3711 x 1748
    2960 x 1850
    1653 x 2338
    2052 x 3507
  • 8987
    KrimzinZV, these aren't dupes but we usually consider this kind of thing as a low-quality edit, so against the rules.
  • 8989
    Please look in the Best of the Worst thread; there you will find some really good examples of this kind of low-quality edit.
  • 9444
    You don't need the public for this; you could easily create an algorithm using Python to compare images pixel by pixel and search for duplicates.
  • 9447
    JCarlin6 said:
    You don't need the public for this; you could easily create an algorithm using Python to compare images pixel by pixel and search for duplicates.
    There's just so much wrong about that…
  • 9459
    Gandalf, JCarlin6: pixel-by-pixel or byte-by-byte comparison is a really, really bad idea. Wallhaven currently hosts more than 400K wallpapers, at an average of almost 700 MB per 1K wallpapers. Not to mention you'd have to retrieve each uploaded image, read its data, and compare it with all the other wallpapers?! Also, one could simply convert the image. So you'd have to spend more than 6-7 days to compare a single WP with the other 400K, which would have a huge impact on both disk (reading from disk) and memory (running the application). The best you can do is to categorize every uploaded image by uploader/size/author (PNG)/type/tags/date created/dimensions. Long story short, it is tooooo late!
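The cost argument in this post assumes every upload is compared against every stored file. For byte-identical copies only, that can be avoided by storing one digest per file and doing a dictionary lookup. A minimal sketch, where the `ExactDupeIndex` class and its method names are made up for illustration and say nothing about how the site stores anything:

```python
import hashlib


def file_digest(data: bytes) -> str:
    """SHA-256 digest of the raw file bytes."""
    return hashlib.sha256(data).hexdigest()


class ExactDupeIndex:
    """Hypothetical in-memory index mapping file digests to wallpaper ids."""

    def __init__(self):
        self._by_digest = {}

    def add(self, wall_id, data):
        """Return the id of an existing byte-identical wall, or None after storing this one."""
        digest = file_digest(data)
        existing = self._by_digest.get(digest)
        if existing is None:
            self._by_digest[digest] = wall_id
        return existing
```

As the post points out, converting or re-saving an image changes its bytes, so this catches only exact re-uploads and not resized or re-encoded copies.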
  • 9470
    Look, there are ways to find duplicates quickly. They are just a little more complicated than "compare all the pixels". We're going to be using IQDB, which should be able to find similar wallpapers quickly enough. It's just a bit annoying to implement because IQDB is a standalone program (not a library) and doesn't come with a handy PHP Interface. But we'll get there. ^^
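To give a feel for what "a little more complicated than compare all the pixels" can mean, here is a small sketch of an average hash, a simple perceptual fingerprint. This is not IQDB's algorithm and not what the site uses; it is a stand-in chosen because it fits in a few lines. It assumes the image has already been decoded to grayscale and is passed in as a list of rows of 0-255 values; the function names are made up for the example.

```python
def average_hash(pixels, size=8):
    """Fingerprint a grayscale image as a size*size-bit integer.

    `pixels` is a list of rows of grayscale values (0-255) and must be
    at least `size` pixels in each dimension.
    """
    height, width = len(pixels), len(pixels[0])
    cells = []
    # Shrink to a size x size grid by averaging each block of pixels.
    for gy in range(size):
        y0, y1 = gy * height // size, (gy + 1) * height // size
        for gx in range(size):
            x0, x1 = gx * width // size, (gx + 1) * width // size
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            cells.append(sum(block) / len(block))
    # One bit per cell: brighter than the overall mean or not.
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")
```

Two uploads whose fingerprints are within a few bits of each other are candidates for a closer look. Because each wallpaper is reduced to one small integer, a resized copy tends to get the same or a nearby fingerprint, which byte or pixel comparison cannot offer.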
  • 9472
  • 9474
    throated said:
    I can't find anything about detecting duplicate images on there, although I guess that could help fill out tags and purity.
  • 9488
    Gandalf, it doesn't have to be complicated. Besides, you can implement it after the alpha phase is over.
  • 9490
    Better dupe detection will hit this year for sure. We won't leave alpha without it being implemented.
  • 9492
    Holy, nothing has to be complicated, but it often ends up being that way when you want to achieve several things within a specific system. We never came to a firm conclusion amongst ourselves about the extent to which we should tolerate dupes, and this thread is evidence that we don't all have the same ideas about what constitutes a dupe in the first place. That's a separate issue but informs what's implemented. Solid dupe detection is one of the niggles we'd like sorted in plenty of time before alpha is over (and has actually been an issue since pretty much the first week). Like you said yourself, it's too late to go about this via certain methods, which is part of the reason we'll be starting fresh again in whatever capacity. Anyway, thanks to you guys for chipping in so far...
  • 9499
    AksumkA, that's amazing! Bye-bye alpha!
    cfunk,
    "it often ends up being that way when you want to achieve several things within a specific system."
    I wouldn't know.
    "we don't all have the same ideas about what constitutes a dupe in the first place"
    1) Not searching if an image already exists. 2) Increasing the uploaded images count.
    "we'll be starting fresh again in whatever capacity."
    Mwahah!!! More png-24!
  • 9508
    I think that not all of these should be deleted, or at least two should be preserved. [40800]
    2500 x 1381
    2500 x 1527
    [100216] 100216 (the last one) is probably the worst copy: it doesn't have the watermark and is the smallest size. Maybe mark the other three as "other size"; that would be my suggestion. I even had this as a wallpaper for some time.
  • 9549
    Vozho, this is the kind of situation where we'd group, which is what I've done (but #2 and #3 are just barely different enough).