Hi,
I saw your post on reddit and thought I'd try and give you some pointers.
How to create a modified 2D array for the recursion? In the end I cloned it and modified the clone but this feels awkward
Generally people tend to stick with immutable data structures in Scala. In this case that would be List
and Vector
instead of Array
. The reasoning here is that immutable data structures can be shared and reused without ever having to worry about accidentally mutating something that's used elsewhere. They are implemented in a way that updates are reasonably efficient as they share the common aspects to the origin data structure (this is called "persistent data structures").
In your case, index-lookup is heavily used so in theory Vector
would be slightly better than List
, however with a length of less than 10 elements even that might not make much of a difference. If in doubt you should always measure before you optimize!
Sometimes a mutable data structure used locally can bring performance benefits, but if you're using backtracking it's unclear if you can use a mutable structure without cloning.
Personally, I almost never have lists with many elements, so I just use List
and don't worry much about it.
(More info here: http://www.lihaoyi.com/post/BenchmarkingScalaCollections.html)
WartRemover disallows the use of ==, so I defined the type-safe === operator they recommend - but isn't this built-in anywhere?
There are several implementations with slight differences. If you use ScalaTest, it has a ===
operator for assertions. Typelevel "cats" solves the issue with the Eq typeclass and provides a ===
operator as syntax extension as well. scalaz
probably has a similar thing if that's your jam.
If you're using InteliiJ, it will also give you a warning whenever you compare unrelated types with ==
. Not a real solution, but worth pointing out.
I don't use sets to get the valid candidates for the empty position, but instead map the checking function across the list 1 to 9. Does Scala use lazy evaluation (so that it would know to not check the later conditions if the row condition is not met for example) or is this less efficient than getting the intersection of the sets that satisfy each separate condition (row unique, column unique, small square unique)?
In general Scala is an eager language. But there are many ways to achieve laziness if needed (e.g. lazy val
, functions, "by-name parameters", and certain abstractions in libraries like cats.effect.IO
or monix.eval.Task
).
In your program it might be even simpler by using a different method to check the number:
// instead of doing this to find out if the canditate number is already in the column...
!(a(row)
.map((x: Int) => x === candidate) // map will always go through the whole list
.foldLeft(false)((x:Boolean, y:Boolean) => x | y) // foldLeft also goes through the whole list
// ...use ".exists" and ".forall" to check if a property is true for "at least one" or "all" elements respectively
// they stop as soon as possible and are considered more idiomatic for this use-case
!a(row).exists(x => x === candidate)
Since you mentioned "Spark":
With Spark you shouldn't worry too much about low-level optimizations. As long as you use the datatypes provided by Spark (e.g. Dataset
) and their transformations, it should be optimized reasonably well.
The bigger problem tends to be the APIs of Spark that can make it difficult to write idiomatic, typesafe Scala.
If you stumble over those things, you could take a look at Frameless, which tries to make the APIs safer to use.
Alright, hope this was at least somewhat helpful. If I didn't answer your questions properly or you have more questions, feel free to tell me.
Good luck on your adventures with Scala and Spark! ๐
~ Felix