Comments (4)

ndm13 commented on August 30, 2024

Also, it seems like long regex strings cause crashes. scallion [bcdfghjklmnpqrstvwxyz][aeiou][aeiou][bcdfghjklmnpqrstvwxyz] runs just fine, yet scallion [bcdfghjklmnpqrstvwxyz][aeiou][aeiou][bcdfghjklmnpqrstvwxyz][aeiou] compiles, runs a hash check, and crashes. Anything longer than that just hangs at Compiling. If there's a length limit, I'm not sure where to find it.

Output of the latter:

Cooking up some delicions scallions...
Using kernel optimized from file kernel.cl (Optimized4)
Using work group size 128
Compiling kernel... done.
Testing SHA1 hash...
CPU SHA-1: d3486ae9136e7856bc42212385ea797094475802
GPU SHA-1: d3486ae9136e7856bc42212385ea797094475802
Looks good!
0x07452B91 (0x06346A00 0x077457D0 0x077457C4 0xD4D03B92) <unknown module>
0x07452B91 (0x06346A00 0x0023E158 0x0023E14C 0xD3878D0A) <unknown module>
0x5E06A0D0 (0x06346A00 0x077457D0 0x077457C4 0x00000000)0x5E06A0D0 (0x06346
A00 0x0023E158 0x0023E14C 0x00010009)

0x64D6DAC3 (0x00010009 0x00000000 0x00000000 0x0053A618)0x64D6DAC3 (0x00000
000 0x00000000 0x00000000 0x0053A618)

0x671EE1F2 (0x0023E200 0xD3814762 0x04E13E80 0x04E07920)0x671EE1F2 (0x07745
878 0xD4D6FD9A 0x04E03E80 0x04E07D20), ?GetTaskExecutor@TaskExecutor@OpenCL
@Intel@@YAPAVITaskExecutor@123@XZ() + 0x7AF2 bytes(s)
0x671E85FF (0x04E07920 0x04E07924 0x705B9EED 0xD38117AD), ?GetTaskExecutor@
TaskExecutor@OpenCL@Intel@@YAPAVITaskExecutor@123@XZ() + 0x1EFF bytes(s)
0x671EE9FC (0x04E1F720 0x04E07920 0x705B6A0F 0xD3811639), ?GetTaskExecutor@
TaskExecutor@OpenCL@Intel@@YAPAVITaskExecutor@123@XZ() + 0x82FC bytes(s)
0x671E52D0 (0x7760E0F2 0x00500840 0x00000070 0x00000000)
, ?GetTaskExecutor@TaskExecutor@OpenCL@Intel@@YAPAVITaskExecutor@123@XZ() +
 0x7AF2 bytes(s)
0x7760E38C (0x00500840 0x00000070 0x00000000 0x0023E360)0x671E85FF (0x04E07
D20 0x04E07D24 0x705B9EED 0xD4D6AC35), ?GetTaskExecutor@TaskExecutor@OpenCL
@Intel@@YAPAVITaskExecutor@123@XZ() + 0x1EFF bytes(s)
0x671EE9FC (0x04E06F20 0x04E07D20 0xD4D6AC21 0x07745AA8), ?GetTaskExecutor@
TaskExecutor@OpenCL@Intel@@YAPAVITaskExecutor@123@XZ() + 0x82FC bytes(s)
0x705B9760 (0x04E07D20 0x04E07D1C 0xD4D6FFB6 0x00000000), ?internal_wait@ta
sk_arena_base@internal@interface7@tbb@@IBEXXZ() + 0x2B20 bytes(s)
, RtlInitUnicodeString() + 0x164 bytes(s)
0x7760E0F2 (0x0023E398 0x06DBB5F0 0x0023E388 0x75350DBB), RtlAllocateHeap()
 + 0xAC bytes(s)
0x75350DA1 (0x00000000 0x05A8C708 0x00500840 0x0023E3B0), CreateEventExW()
+ 0x6E bytes(s)
0x75350DBB (0x00000000 0xD38146FA 0x00500860 0x00500874), CreateEventExW()
+ 0x88 bytes(s)
0x671E6629 (0x671E5F03 0xD3814122 0x00500840 0x00500840)
0x671E5EF5 (0x0023E450 0x64D7BD37 0x06DBB648 0x64D73951)
0x671E59FE (0xD3812530 0x05E5FFA0 0x058571A0 0x00000000)
0x64D73951 (0x07A8ABA8 0x6682A6DB 0xD3814048 0x06DAD804), clDevInitDeviceAg
ent() + 0x581 bytes(s)
0x6684DB57 (0x06DAD818 0x0023E4D4 0x66834102 0x6683411E), clWaitForEvents()
 + 0x70A47 bytes(s)
0x6682B42B (0x6683411E 0xD3814014 0xD38141E6 0x00000002), clWaitForEvents()
 + 0x4E31B bytes(s)
0x66834102 (0xD38141A4 0x06DAD790 0x06DAD800 0x058571A4), clWaitForEvents()
 + 0x56FF2 bytes(s)
0x6683465B (0x07A8AA98 0x02CF00EC 0x005575E3 0x66862590), clWaitForEvents()
 + 0x5754B bytes(s)
0x7760E0F2 (0xFFEEFFEE 0x00000000 0x05210010 0x004E00A8), RtlAllocateHeap()
 + 0xAC bytes(s)
0x01003310 (0x00000000 0x05210010 0x004E00A8 0x004E0000) <unknown module>
0xFFEEFFEE (0x05210010 0x004E00A8 0x004E0000 0x004E0000) <unknown module>
0xFFEEFFEE (0x05210010 0x004E00A8 0x004E0000 0x004E0000) <unknown module>

from scallion.

richardklafter commented on August 30, 2024

Scallion cannot run regexes on the GPU. It converts them to a list of strings in https://github.com/lachesis/scallion/blob/gpg/scallion/RegexPattern.cs. We simply did not implement ranges or repeating patterns. If you're decent at C#, it would be awesome if you added those features.

The other issue you mentioned is a practical bug. Scallion converts a regex to a list of strings. If these strings are short (fewer than 7 or 8 characters), they are checked on the GPU. If they are long, they are checked on the CPU.

In your last example, [bcdfghjklmnpqrstvwxyz][aeiou][aeiou][bcdfghjklmnpqrstvwxyz][aeiou], all the generated strings are short, so all of them are checked on the GPU. Further, the set of generated strings is huge:
"bcdfghjklmnpqrstvwxyz".length * "aeiou".length * "aeiou".length * "bcdfghjklmnpqrstvwxyz".length * "aeiou".length = 21 * 5 * 5 * 21 * 5 = 55125 possibilities
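To make that count concrete, here is a rough Python sketch of how a pattern built from literals and bracketed character classes expands into a list of strings. This is not scallion's RegexPattern.cs (which is C#); the helper names `parse_classes`, `count_expansions` and `expand` are made up for illustration, and the sketch assumes the pattern contains only plain characters and simple `[...]` classes.

```python
import re
from itertools import product
from math import prod

def parse_classes(pattern):
    """Split a pattern into one string of allowed characters per position.

    Hypothetical helper: handles only literals and simple [...] classes,
    with no ranges, repeats or escapes.
    """
    tokens = re.findall(r"\[([^\]]+)\]|(.)", pattern)
    return [cls if cls else lit for cls, lit in tokens]

def count_expansions(pattern):
    """Number of concrete strings the pattern expands to."""
    return prod(len(chars) for chars in parse_classes(pattern))

def expand(pattern):
    """Lazily yield every concrete string the pattern stands for."""
    return ("".join(combo) for combo in product(*parse_classes(pattern)))

CONSONANTS = "bcdfghjklmnpqrstvwxyz"
VOWELS = "aeiou"
short_pattern = f"[{CONSONANTS}][{VOWELS}][{VOWELS}][{CONSONANTS}]"
long_pattern = short_pattern + f"[{VOWELS}]"

print(count_expansions(short_pattern))  # 11025
print(count_expansions(long_pattern))   # 55125
print(next(expand(long_pattern)))       # baaba
```

Each extra character class multiplies the size of the list, which is why adding a single [aeiou] takes the pattern from 11025 strings to 55125.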

Try the pattern aaaaaaaa[bcdfghjklmnpqrstvwxyz][aeiou][aeiou][bcdfghjklmnpqrstvwxyz][aeiou], which should run. This example only produces a single string that is checked on the GPU.

In summary, Scallion was written to find longer patterns. The regex support was added to allow you to search for variations and increase your odds of finding a collision. If you are looking for a shorter pattern don't use regexes because scallion can easily find a short collision.

That being said, scallion really should print out a warning if it's going to try to check 55125 patterns on the GPU :P

ndm13 commented on August 30, 2024

Hey, thanks a lot. I was looking to get either alphabetic or word-like results, but it seems that this isn't feasible without running many threads because the string list would be massive! I wasn't aware of this, thanks for clarifying. C# isn't something I'm experienced in, so sorry I can't help out.

lachesis commented on August 30, 2024

For your use case, if your only goal is to get a prefix which follows the pattern "consonant vowel vowel consonant vowel", you're probably better off using shallot, or using scallion with a shorter pattern and the "-c" option and then filtering afterwards.
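The "filtering afterwards" step could look something like the Python sketch below. It assumes you have already collected the generated hostnames into a list of strings; how scallion prints its results is not shown in this thread, so the sample hostnames and the `matching_prefix` helper are invented for illustration.

```python
import re

# Consonant, vowel, vowel, consonant, vowel at the start of the hostname.
CVVCV = re.compile(r"^[bcdfghjklmnpqrstvwxyz][aeiou]{2}[bcdfghjklmnpqrstvwxyz][aeiou]")

def matching_prefix(hostnames):
    """Keep only hostnames whose first five characters follow the CVVCV shape."""
    return [h for h in hostnames if CVVCV.match(h)]

# Made-up example hostnames, not real scallion output.
candidates = [
    "beaconabcdefgh23.onion",
    "xkcdqqabcdefgh23.onion",
    "soupa7abcdefgh23.onion",
]

print(matching_prefix(candidates))
# ['beaconabcdefgh23.onion', 'soupa7abcdefgh23.onion']
```

Filtering on the CPU after the fact keeps the pattern handed to scallion short, so the long list of expanded strings never has to be built.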
