CODExperiments

Duplicate Image Finder … sort of!

Posted on April 15, 2010 / January 2, 2015 by Zoon
C#, Programming

Going through my “huge” archive of digital camera images, I noticed that I had some duplicate images. The filenames differ but the content is the same. Now, I could just download a tool from the almighty internet that could compare all my images, but that would be going too far! Why not give it a go myself?



Several approaches come to mind. I could start by cataloging all the files and comparing file sizes. If two file sizes match, I would then compare the content of the files. If the content matches too, the file should be marked as a duplicate.
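For reference, here is a minimal sketch of that first approach. It is not what I ended up using, and the names `SizeThenContent`, `SameContent` and `FindDuplicates` are made up for this illustration. Files are grouped by size first, and only files of equal size are compared byte by byte.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

static class SizeThenContent
{
    // Returns true when both files contain exactly the same bytes.
    public static bool SameContent( string pathA, string pathB )
    {
        // Different sizes can never be duplicates, so skip the expensive read.
        if( new FileInfo( pathA ).Length != new FileInfo( pathB ).Length )
            return false;

        using( FileStream a = File.OpenRead( pathA ) )
        using( FileStream b = File.OpenRead( pathB ) )
        {
            int byteA;
            do
            {
                byteA = a.ReadByte();
                if( byteA != b.ReadByte() )
                    return false;
            } while( byteA != -1 ); // -1 means end of file
        }
        return true;
    }

    // Returns (original, duplicate) pairs found among the given files.
    public static List<KeyValuePair<string, string>> FindDuplicates( IEnumerable<string> files )
    {
        var result = new List<KeyValuePair<string, string>>();

        // Only files with the same size are worth comparing.
        foreach( var sameSize in files.GroupBy( f => new FileInfo( f ).Length ) )
        {
            var originals = new List<string>();
            foreach( string file in sameSize )
            {
                string match = originals.FirstOrDefault( o => SameContent( o, file ) );
                if( match == null )
                    originals.Add( file );
                else
                    result.Add( new KeyValuePair<string, string>( match, file ) );
            }
        }
        return result;
    }
}
```

The upside is that no hashing is needed at all. The downside is that a big group of same-sized files means a lot of pairwise comparisons.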

I could also Google for an advanced image comparison algorithm and spend the next week trying to implement it.

I took the easy way out 🙂 First I cataloged all the image files (*.jpg). Then I calculated the MD5 hash of every file. Before all that, I had created an SQL Compact Edition 3.5 database with a table containing two fields: filename and MD5 hash. Only one index was created, a unique index on the MD5 hash.


So every time I tried to insert a record whose MD5 hash already existed in the database, I got an exception. In the exception handling code I renamed the file to “DUP_” + original filename. Naturally I could have deleted it on the spot, but I wanted to make sure the file really was a dup 🙂
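The database helper itself is not shown in this post. If you want to try the idea without setting up SQL Compact Edition, a rough in-memory stand-in could look like the sketch below. This is an assumption for illustration, not my actual database code: a `Dictionary` plays the role of the table with its unique index, and the class name `HashCatalog` is made up. Like the database, `Dictionary.Add` throws when the key (the hash) already exists.

```csharp
using System;
using System.Collections.Generic;

static class HashCatalog
{
    // MD5 hash -> first filename seen with that hash.
    // Stand-in for the database table and its unique index on the hash.
    private static readonly Dictionary<string, string> seen =
        new Dictionary<string, string>( StringComparer.OrdinalIgnoreCase );

    // Returns true if the row was stored, false if the hash already existed.
    public static bool InsertRow( string filename, string md5Hash )
    {
        try
        {
            // Throws ArgumentException when the hash is already present.
            seen.Add( md5Hash, filename );
            return true;
        }
        catch( ArgumentException )
        {
            // Same hash seen before: the caller should treat this file as a duplicate.
            return false;
        }
    }
}
```

Unlike the database, this catalog is gone when the program exits, so it only catches duplicates within a single run.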

Anyway, code speaks louder than words:
[sourcecode language="csharp"]
// Needs: using System; using System.IO;
//        using System.Security.Cryptography; using System.Text;

// Create the default MD5 implementation
MD5 md5Hasher = MD5.Create();

// Get all files, including subdirectories (recursively)
string[] files = Directory.GetFiles( @"path_to_imagefiles_here/",
                                     "*.jpg",
                                     SearchOption.AllDirectories );

// Loop through all files
foreach( string file in files )
{
    try
    {
        // Read the entire content of the file
        byte[] fileData = File.ReadAllBytes( file );

        // Compute the MD5 hash
        byte[] md5Data = md5Hasher.ComputeHash( fileData );

        // Create the human-readable MD5 string
        StringBuilder sBuilder = new StringBuilder();

        // Loop through each byte of the hashed data
        // and format each one as a hexadecimal string.
        for( int i = 0; i < md5Data.Length; i++ )
        {
            sBuilder.Append( md5Data[i].ToString( "x2" ) );
        }

        // insertRow (not shown) returns false when the hash already exists
        if( !insertRow( file, sBuilder.ToString() ) )
        {
            // Rename the file to DUP_ + original filename
            FileInfo fi = new FileInfo( file );
            File.Move( file, Path.Combine( fi.DirectoryName, "DUP_" + fi.Name ) );
        }
    }
    catch( IOException ex )
    {
        // Skip files that cannot be read or renamed
        Console.WriteLine( "Skipping {0}: {1}", file, ex.Message );
    }
}
[/sourcecode]
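One thing to be aware of: the listing above loads each file completely into memory before hashing it. For big files, a possible variation is to hand the file stream straight to `ComputeHash`, which also accepts a `Stream`. The sketch below shows that idea. The class name `StreamHash` and method name `Md5Hex` are made up for this example.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;
using System.Text;

static class StreamHash
{
    // Returns the MD5 hash of a file as a lowercase hexadecimal string,
    // reading the file through a stream instead of one big byte array.
    public static string Md5Hex( string path )
    {
        using( MD5 md5 = MD5.Create() )
        using( FileStream stream = File.OpenRead( path ) )
        {
            byte[] hash = md5.ComputeHash( stream );

            // Format each byte as two hexadecimal digits.
            StringBuilder sBuilder = new StringBuilder( hash.Length * 2 );
            foreach( byte b in hash )
            {
                sBuilder.Append( b.ToString( "x2" ) );
            }
            return sBuilder.ToString();
        }
    }
}
```

The returned string has the same format as the one built in the loop above, so it could be passed to `insertRow` unchanged.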

Granted, this might not be the best way but it only took about 10 minutes to write… and it works for me 🙂

Tags: C# SQL
