-
Notifications
You must be signed in to change notification settings - Fork 62
X'[1,:]
much slower than X[:, 1]
, despite theoretically same access pattern
#628
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Fortunately, because it's Julia, I can just define something like function Base.getindex(M::LinearAlgebra.Adjoint{T, <:SparseMatrixCSC{T}}, i::Int, ::Colon) where {T}
getindex(M.parent, :, i)
end and then it is fast. However, I think the current behaviour is quite unexpected, and perhaps something similar could be merged upstream? Happy to prepare something if there is interest. |
This is technically wrong, you need to adjoint recursively on the parent as well, but you are welcome to open a pull request. |
That's right, it should be something like adjoint.(getindex(M.parent, :, i)) This is suboptimal, because it allocates two vectors. Unfortunately, something like adjoint.(view(M.parent, :, i)) seems to create a dense vector. Looks like we're missing some broadcasting optimization for single columns of CSC-matrices. |
Given JuliaLang/julia#40632 and JuliaLang/julia#31677 p.s.
|
Uh oh!
There was an error while loading. Please reload this page.
Expected behaviour
When I allocate a
SparseMatrixCSC
, indexing columns is fast and indexing rows slow. E.g.,Further, when I create the lazy adjoint, then the behaviour should switch
and in particular, I'd expect the timings for each "fast" and "slow" to approximately match.
Actual behaviour
Unfortunately, although
Adjoint{SparseMatrixCSC}
"simulates" aSparseMatrixCSR
, the indexing is quite slow forXt[1, :]
.In particular:
Notably,
Xt[1,:]
is about 100 times slower thanX[:, 1]
although the same elements are indexed, and the data structure should be the same.I have already tried to understand what's happening by
Cthulhu.@descend
ing down the indexing callstack of the lazy transpose, but the generated functions are too difficult to see through for me.Why do I need this
I have an application where sparse matrix indexing performance is critical. For parts of the algorithm I need to retrieve columns quickly (no problem), and for some parts I need to retrieve rows quickly. It would therefore be nice to just explicitly compute the transpose once, and then index into that, i.e.
X = copy(transpose(X))'
. Then I don't need to change any of the existing code, but row lookup should be fast. However, as you can see above, it is not.The text was updated successfully, but these errors were encountered: