Comments (10)
Changing types based on string lengths makes it too hard to infer the types of these rather common operations. Instead, we should have the option to wrap a string as BigString(s) if s might be large, and BigString can use the memory-saving versions of these operations.
There's not much difference between print_to_string and StringBuilder. with_output_stream can be skipped in some cases by using write() with an explicit destination argument. It's also nice to be able to write output directly to an I/O endpoint without building a temporary string first.
Simple string building cases can also be handled by pushing characters into an array.
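As a rough illustration of the two ideas above (writing to an explicit destination, and pushing characters into an array), here is a minimal sketch in current Julia. It assumes the modern names IOBuffer, print, take!, and join rather than the 2011-era functions named in this thread, and the function name greet is hypothetical.

```julia
# Printing version: writes straight to whatever IO destination it is given,
# so no temporary string is built along the way.
# `greet` is a hypothetical example function, not part of Julia.
function greet(io::IO, name::AbstractString)
    print(io, "Hello, ", name, '!')
end

# The destination can be an in-memory buffer (or a file, socket, stdout, ...).
buf = IOBuffer()
greet(buf, "world")
from_buffer = String(take!(buf))

# Simple cases: push characters into an array, then join them into a string.
chars = Char[]
for c in "abc"
    push!(chars, uppercase(c))
end
from_chars = join(chars)
```

Passing stdout instead of buf would send the same output directly to the terminal.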
from julia.
Makes sense. I can make the BigString change easily.
Is this an argument for continuing to implement core string building functionality by writing the printing version first and then defining the string creating version by applying print_to_string to the printing version?
from julia.
Somewhat, but multiple approaches can be used. For example, if you're just combining strings and characters you can use write() instead of going through print_to_string. We might want to provide some nicer names for memio, takebuf_string, and write, and make it look more like StringBuilder. Or for something like strcat I would determine the size of the result, allocate it once, and use memcpy.
The trouble is that if I do something like
    write(io, strcat(a,b,c))
what you ideally want is to write each string without forming the temporary. Even if strcat is written using an I/O buffer you don't get that automatically here. I might have to say
    strcat_to(io, a, b, c)
but that's not a very nice interface. If a, b, or c is a BigString though, the strcat is done lazily and you get the desired behavior of writing all the pieces with no copying. This convinces me that there's no advantage to writing all the string functions in terms of printing. So do whatever's simplest, fastest, or most convenient, and let BigString handle other concerns. How does that sound?
print_escaped is a bit different since we know that one of its main uses is output. So strcat etc. don't necessarily need to imitate it.
On Tue, May 3, 2011 at 12:38 PM, StefanKarpinski <[email protected]> wrote:
> Makes sense. I can make the BigString change easily.
> Is this an argument for continuing to implement core string building functionality by writing the printing version first and then defining the string creating version by applying print_to_string to the printing version?
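The two strategies discussed in this comment can be sketched in current Julia as follows. This is only an illustration under stated assumptions: strcat_to is the hypothetical interface named above, concat_once is a made-up name for the "determine the size, allocate once, copy" approach, and copyto! stands in for a raw memcpy call.

```julia
# Hypothetical strcat_to: write each piece straight to the destination,
# without first forming the concatenated temporary string.
function strcat_to(io::IO, parts::AbstractString...)
    for p in parts
        print(io, p)
    end
    return io
end

# Hypothetical concat_once: compute the total byte size, allocate the
# result once, and copy each piece's bytes into place.
function concat_once(parts::String...)
    total = 0
    for p in parts
        total += sizeof(p)
    end
    out = Vector{UInt8}(undef, total)
    pos = 1
    for p in parts
        n = sizeof(p)
        n == 0 && continue
        copyto!(out, pos, codeunits(p), 1, n)  # block copy of the raw bytes
        pos += n
    end
    return String(out)
end
```

The first form suits output to an I/O endpoint; the second suits the case where a single new string really is the desired result.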
from julia.
This seems like a 2.0 thing.
from julia.
We're actually pretty good on this at this point. All strcat and string ref (substring) operations on ASCIIString and UTF8String objects use memcpy now, so they're fast and they don't create exotic string objects (RopeString, SubString, etc.). Repeating a string does create a RepString object, but I think that's probably acceptable. I could make a copying implementation of that rather easily.
If someone wants to use a StringBuilder pattern, they can write the printing version and then use print_to_string on it. I feel like that's a reasonable approach if one is worried about strcat efficiency, with the added bonus of providing a version of the same functionality that can print without having to build a string at all.
I think this issue is not fully addressed, but it is addressed well enough for v1.0. Will reassign to v2.0.
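The StringBuilder-style pattern described in this comment (write the printing version first, then derive the string-creating version from it) might look like the sketch below in current Julia. It assumes sprint as the present-day counterpart of print_to_string, and both function names are hypothetical.

```julia
# Printing version: the core implementation writes to an IO destination.
# `print_repeated` is a hypothetical name used only for illustration.
function print_repeated(io::IO, s::AbstractString, n::Integer)
    for _ in 1:n
        print(io, s)
    end
end

# String-creating version, defined by capturing the printing version's output.
# sprint calls print_repeated(io, s, n) with an internal buffer as io.
repeated_string(s::AbstractString, n::Integer) = sprint(print_repeated, s, n)
```

Callers that only need output can call print_repeated(stdout, "ab", 3) and never build a string at all.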
from julia.
Can I replace memcpy(a) with copy(a)?
from julia.
Is copy(a::Array{Uint8,1}) as efficient as memcpy is?
from julia.
It should be now that we changed copy_to to use memcpy for arrays where possible.
from julia.
We can get rid of memcpy entirely then. I'll do it.
from julia.
We also need to experiment to find the array sizes at which memcpy is faster. It is actually slower for small arrays. copy_to should have these smarts.
On 10-Jul-2011, at 12:43 AM, [email protected] wrote:
> It should be now that we changed copy_to to use memcpy for arrays where possible.
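One way to give a copy routine "these smarts" is a size cutoff: a plain loop for small arrays and a block copy for larger ones. The sketch below is in current Julia and is an assumption-laden illustration only. The name copy_bytes! and the cutoff value are hypothetical, the right cutoff would have to be measured, and unsafe_copyto! stands in for a raw memcpy call.

```julia
# Hypothetical cutoff; the real value should come from benchmarking.
const SMALL_COPY_THRESHOLD = 16

# Hypothetical helper: copy all of src into the front of dest.
function copy_bytes!(dest::Vector{UInt8}, src::Vector{UInt8})
    n = length(src)
    length(dest) >= n || throw(ArgumentError("destination too short"))
    if n < SMALL_COPY_THRESHOLD
        # Small arrays: an element-by-element loop avoids call overhead.
        @inbounds for i in 1:n
            dest[i] = src[i]
        end
    else
        # Larger arrays: one block copy (bounds were checked above).
        unsafe_copyto!(dest, 1, src, 1, n)
    end
    return dest
end
```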
from julia.