
Thread: A suggestion for the scan speed increase

  1. #1
    Member
    Join Date
    Oct 2006
    Posts
    66


    Hello,

    This is my suggestion for increasing the scan speed mechanism.

    There are many AV/AS products out there with millions of detection patterns. For example, Avira detects more than a million malware samples, Kaspersky has about 1.4 million (14 lakh) malware signatures, and so on. Yet they still scan quickly (courtesy: AV-Comparatives).


    As I have suggested before, it would be great if Spybot scanned for malware track-wise (by HDD tracks). At present the same file (for example shell32.dll) gets scanned over and over for different malware signatures.

    The scan is performed with the signatures as the reference point. How about scanning each file once and comparing it with the whole list of signatures?

    FYI, that is the scanning method most AV/AS products use.
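
    A minimal sketch of the difference between the two loop orders, written in C. It is not Spybot's code; the `targets` table, the file and signature counts, and both function names are made up for illustration. It only counts how often a file would have to be opened under each order.

```c
#define N_FILES 3
#define N_SIGS  4

/* Hypothetical data: targets[s][f] is 1 if signature s inspects file f. */
static const int targets[N_SIGS][N_FILES] = {
    {1, 1, 0},
    {0, 1, 0},
    {1, 1, 1},
    {0, 1, 1},
};

/* Signature-driven order: each signature opens every file it targets. */
int opens_signature_first(void)
{
    int opens = 0;
    for (int s = 0; s < N_SIGS; s++)
        for (int f = 0; f < N_FILES; f++)
            if (targets[s][f])
                opens++;        /* file f is read again for signature s */
    return opens;
}

/* File-driven order: each file is opened once and then compared
   with every signature that is interested in it. */
int opens_file_first(void)
{
    int opens = 0;
    for (int f = 0; f < N_FILES; f++) {
        int needed = 0;
        for (int s = 0; s < N_SIGS; s++)
            if (targets[s][f])
                needed = 1;
        if (needed)
            opens++;            /* one read serves all signatures */
    }
    return opens;
}
```

    With this made-up table, `opens_signature_first()` counts 8 opens and `opens_file_first()` counts 3. The sketch ignores caching and any dependencies between signatures.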

  2. #2
    Member
    Join Date
    Oct 2006
    Posts
    66

    Here is a document that will help you understand what I mean.

    Please excuse me for the hand-made document.

    A picture says a thousand words.


    Last edited by xpsunny; 2008-12-07 at 10:10.

  3. #3
    Member
    Join Date
    Oct 2006
    Posts
    66


    This one gives a better overview of the technology.


    http://img242.imageshack.us/img242/2659/scan0001ic0.png
    Last edited by xpsunny; 2008-12-07 at 10:39.

  4. #4
    Member
    Join Date
    Oct 2006
    Posts
    66

    Please disregard post #2.

  5. #5
    Member
    Join Date
    Oct 2006
    Posts
    66


    Please delete posts #2, #4, #5.

  6. #6
    PepiMK (Member of Team Spybot)
    Join Date
    Oct 2005
    Location
    Planet Earth
    Posts
    3,601

    Well, it's not that simple, I'm afraid.

    You forgot the arrows going back from files 1 and 3 (row 2) into signature 2 (row 1), for example; see the last part of Wiki: Algo-Prefix. Some patterns are partially defined by re-using previous scan results. Add to that the extra complexity of cross-dependencies with the registry scan.

    Plus a few more things. Tomorrow's 2.0 blog post will deal with the problem of various scanner concepts.
    Just remember, love is life, and hate is living death.
    Treat your life for what it's worth, and live for every breath
    (Black Sabbath: A National Acrobat)

  7. #7
    Member
    Join Date
    Oct 2006
    Posts
    66

    Originally posted by PepiMK:
        Well, it's not that simple, I'm afraid.

        You forgot the arrows going back from files 1 and 3 (row 2) into signature 2 (row 1), for example; see the last part of Wiki: Algo-Prefix. Some patterns are partially defined by re-using previous scan results. Add to that the extra complexity of cross-dependencies with the registry scan.

        Plus a few more things. Tomorrow's 2.0 blog post will deal with the problem of various scanner concepts.

    Are you referring to screenshot 2? Please disregard the first screenshot.

    You said, "You forgot the arrows going back from file 1 and three (row 2) into signature 2 (row 1)". Well, that is exactly what I wanted to convey! The current mechanism scans only the most probable zone of infection.

    Let me explain with an example. Signature 1 contains the detection code for a few system32 files infected by trojan.fujack; let's assume they are shell.dll, shell32.dll, etc., and call these files 1, 3, etc. Signature 1 will therefore scan only the files that this specific malware targets.

    Now Signature 2 scans for an infected shell32.dll file only.

    Conclusion: shell32.dll gets scanned twice, once for each of the two signatures.

    If you look at rows 2 and 3, the file scanning works as a simple "if ... else" conditional sequence.
    Last edited by xpsunny; 2008-12-07 at 16:29.

  8. #8
    Member
    Join Date
    Oct 2006
    Posts
    66

    Well, I wrote a simple "if..else" ladder in C for detecting malware via MD5.

    For information purposes only: I am a first-semester student and a newbie in programming, so sorry in advance for any errors.


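    /* Program to detect malware based on MD5 hashes */

    #include <stdio.h>
    #include <string.h>

    /* Signature base: MD5 hashes of known malware.
       The values are placeholders, not real hashes. */
    static const char *signature_md5[] = {
        "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
        "bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb",
    };

    /* MD5 hashes of the files found on disk (placeholders as well). */
    static const char *file_md5[] = {
        "cccccccccccccccccccccccccccccccc",
        "bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb",
    };

    int main(void)
    {
        size_t n_sigs = sizeof signature_md5 / sizeof signature_md5[0];
        size_t n_files = sizeof file_md5 / sizeof file_md5[0];
        size_t i, k;

        for (i = 0; i < n_files; i++) {         /* visit each file once */
            int infected = 0;

            for (k = 0; k < n_sigs; k++) {      /* compare with every signature */
                if (strcmp(file_md5[i], signature_md5[k]) == 0) {
                    infected = 1;
                    break;
                }
            }

            /* Stand-ins for the "del" and "skip" procedures. */
            if (infected)
                printf("file %lu: del\n", (unsigned long)i);
            else
                printf("file %lu: skip\n", (unsigned long)i);
        }
        return 0;
    }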

  9. #9
    PepiMK (Member of Team Spybot)
    Join Date
    Oct 2005
    Location
    Planet Earth
    Posts
    3,601

    Nope, it's far more complicated than that.

    Standard MD5 hashes can be used only on static files. An MD5 can match only a single instance of a possibly very random file. If we detected all files by MD5, the detection database would be hundreds of MB in size, which is unacceptable, and it would still not cover the many variants that morph so much that you could never collect them all! If you take a look here, you'll notice dozens of other parameters that can be used instead of MD5 (and those are just the public ones), where sometimes a single parameter replaces thousands of MD5s.

    And no file gets scanned twice, "even" in a (good) pattern-based scanner. Besides the obvious method, caching of results, we use a hybrid approach that is far from the linearity we display in the form of malware names during the scan. That display is simply a simplification, since the actual progress could not easily be shown in 2D.
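
    A rough sketch of what "caching of results" could look like, in C. This is an assumption-laden illustration and not Spybot's implementation: the cache layout, `MAX_FILES`, and the toy checksum (a byte sum of the file name standing in for reading and hashing the file) are all hypothetical.

```c
#include <stddef.h>

#define MAX_FILES 8

/* Hypothetical per-file cache entry: the expensive work is stored once. */
struct cache_entry {
    int valid;
    unsigned long checksum;
};

static struct cache_entry cache[MAX_FILES];
static int expensive_reads;    /* how often a file was really processed */

/* Stand-in for reading and hashing a file: sums the bytes of its name. */
static unsigned long read_and_checksum(const char *name)
{
    unsigned long sum = 0;
    expensive_reads++;
    for (size_t i = 0; name[i] != '\0'; i++)
        sum += (unsigned char)name[i];
    return sum;
}

/* Return the checksum of file number id (must be below MAX_FILES),
   computing it only the first time it is requested. */
unsigned long cached_checksum(int id, const char *name)
{
    if (!cache[id].valid) {
        cache[id].checksum = read_and_checksum(name);
        cache[id].valid = 1;
    }
    return cache[id].checksum;
}

/* Number of real (uncached) reads performed so far. */
int reads_so_far(void)
{
    return expensive_reads;
}
```

    Any number of patterns can then call `cached_checksum(0, "shell32.dll")`; only the first call does the expensive work, and later calls reuse the stored value.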

    You also misunderstood the "arrows back" I'm afraid. Take a look at this simple example (not real syntax but a bit simplified and constructed OpenSBI syntax):
    Code:
    File:"test","<$WINDIR>\<regexpr>([^\.]*\.exe)","filesize=12345,md=ABCD..."
    RegyKey:"test",HKLM,"\Test\","<regexpr>([a-z]*)","testvalue=<$REGMATCH1>"
    File:"test","<$WINDIR>\<$REGMATCH1>","..."
    There's a registry key that depends on a file's name, and another file that depends on the registry key's name.

    In pure filesystem/registry iteration, that would imply that the registry scanner has to wait until the file scanner has finished scanning everything, while at the same time the file scanner has to wait until the registry scanner has completely finished. That situation cannot be resolved; the scanner would simply hang forever. A good equivalent in coding would be deadlocks in threads (see also semaphores etc.). The simple iteration approach would thus have to give up any dependencies between detections, which would slow everything down even more and might make detection of some of the worst malware impossible.
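
    A toy model of that dependency chain, in C. Everything here is an assumption made for illustration (the `step` struct, the function names, the one-pass-per-scanner model); it is not how Spybot works internally. Instead of hanging, this version simply leaves steps unresolved when a pass cannot get the input it needs.

```c
#include <stddef.h>

enum kind { KIND_FILE, KIND_REGISTRY };

struct step {
    enum kind kind;
    int depends_on;    /* index of the step whose match is needed, or -1 */
};

/* Mirrors the three example lines: file -> registry key -> file. */
static const struct step steps[] = {
    { KIND_FILE,     -1 },
    { KIND_REGISTRY,  0 },
    { KIND_FILE,      1 },
};
#define N_STEPS (sizeof steps / sizeof steps[0])

static int ready(const int *done, size_t i)
{
    return steps[i].depends_on < 0 || done[steps[i].depends_on];
}

/* Pure iteration: one complete pass per scanner kind, starting with
   `first`. Returns how many steps could be resolved. */
int resolve_by_passes(enum kind first)
{
    int done[N_STEPS] = {0};
    int resolved = 0;
    enum kind order[2];

    order[0] = first;
    order[1] = (first == KIND_FILE) ? KIND_REGISTRY : KIND_FILE;
    for (int p = 0; p < 2; p++)
        for (size_t i = 0; i < N_STEPS; i++)
            if (!done[i] && steps[i].kind == order[p] && ready(done, i)) {
                done[i] = 1;
                resolved++;
            }
    return resolved;
}

/* Pattern-driven: keep running whichever step has its input available,
   regardless of whether it touches files or the registry. */
int resolve_by_dependency(void)
{
    int done[N_STEPS] = {0};
    int resolved = 0, progress = 1;

    while (progress) {
        progress = 0;
        for (size_t i = 0; i < N_STEPS; i++)
            if (!done[i] && ready(done, i)) {
                done[i] = 1;
                resolved++;
                progress = 1;
            }
    }
    return resolved;
}
```

    In this model neither pass order resolves all three steps: files first gets 2, registry first gets 1. Following the dependencies resolves all 3.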

    What you describe is the old AV concept, which does not take the registry and the complexity of modern malware into account.
    Last edited by PepiMK; 2008-12-07 at 22:38.

  10. #10
    Member
    Join Date
    Oct 2006
    Posts
    66

    Okay, now I understand what you were trying to say.

    By the way, in which programming language is Spybot written?
