Julia, like most technical computing languages, provides a firstclass array implementation. Most technical computing languages pay a lot of attention to their array implementation at the expense of other containers. Julia does not treat arrays in any special way. The array library is implemented almost completely in Julia itself, and derives its performance from the compiler, just like any other code written in Julia. As such, it's also possible to define custom array types by inheriting from AbstractArray.
See the manual section on the AbstractArray interface for more details on implementing a custom array type.
An array is a collection of objects stored in a multidimensional grid. In the most general case, an array may contain objects of type Any
. For most computational purposes, arrays should contain objects of a more specific type, such as Float64
or Int32
.
In general, unlike many other technical computing languages, Julia does not expect programs to be written in a vectorized style for performance. Julia's compiler uses type inference and generates optimized code for scalar array indexing, allowing programs to be written in a style that is convenient and readable, without sacrificing performance, and using less memory at times.
In Julia, all arguments to functions are passed by reference. Some technical computing languages pass arrays by value, and this is convenient in many cases. In Julia, modifications made to input arrays within a function will be visible in the parent function. The entire Julia array library ensures that inputs are not modified by library functions. User code, if it needs to exhibit similar behavior, should take care to create a copy of inputs that it may modify.
Function  Description 

eltype(A) 
the type of the elements contained in A

length(A) 
the number of elements in A

ndims(A) 
the number of dimensions of A

size(A) 
a tuple containing the dimensions of A

size(A,n) 
the size of A along dimension n

indices(A) 
a tuple containing the valid indices of A

indices(A,n) 
a range expressing the valid indices along dimension n

eachindex(A) 
an efficient iterator for visiting each position in A

stride(A,k) 
the stride (linear index distance between adjacent elements) along dimension k

strides(A) 
a tuple of the strides in each dimension 
Many functions for constructing and initializing arrays are provided. In the following list of such functions, calls with a dims...
argument can either take a single tuple of dimension sizes or a series of dimension sizes passed as a variable number of arguments. Most of these functions also accept a first input T
, which is the element type of the array. If the type T
is omitted it will default to Float64
.
Function  Description 

Array{T}(dims...) 
an uninitialized dense Array

zeros(T, dims...) 
an Array of all zeros 
zeros(A) 
an array of all zeros with the same type, element type and shape as A

ones(T, dims...) 
an Array of all ones 
ones(A) 
an array of all ones with the same type, element type and shape as A

trues(dims...) 
a BitArray with all values true

trues(A) 
a BitArray with all values true and the same shape as A

falses(dims...) 
a BitArray with all values false

falses(A) 
a BitArray with all values false and the same shape as A

reshape(A, dims...) 
an array containing the same data as A , but with different dimensions 
copy(A) 
copy A

deepcopy(A) 
copy A , recursively copying its elements 
similar(A, T, dims...) 
an uninitialized array of the same type as A (dense, sparse, etc.), but with the specified element type and dimensions. The second and third arguments are both optional, defaulting to the element type and dimensions of A if omitted. 
reinterpret(T, A) 
an array with the same binary data as A , but with element type T

rand(T, dims...) 
an Array with random, iid [1] and uniformly distributed values in the halfopen interval $[0, 1)$

randn(T, dims...) 
an Array with random, iid and standard normally distributed values 
eye(T, n) 
n byn identity matrix 
eye(T, m, n) 
m byn identity matrix 
linspace(start, stop, n) 
range of n linearly spaced elements from start to stop

fill!(A, x) 
fill the array A with the value x

fill(x, dims...) 
an Array filled with the value x

iid, independently and identically distributed.
The syntax [A, B, C, ...]
constructs a 1d array (vector) of its arguments. If all arguments have a common promotion type then they get converted to that type using convert()
.
Arrays can be constructed and also concatenated using the following functions:
Function  Description 

cat(k, A...) 
concatenate input nd arrays along the dimension k

vcat(A...) 
shorthand for cat(1, A...)

hcat(A...) 
shorthand for cat(2, A...)

Scalar values passed to these functions are treated as 1element arrays.
The concatenation functions are used so often that they have special syntax:
Expression  Calls 

[A; B; C; ...] 
vcat() 
[A B C ...] 
hcat() 
[A B; C D; ...] 
hvcat() 
hvcat()
concatenates in both dimension 1 (with semicolons) and dimension 2 (with spaces).
An array with a specific element type can be constructed using the syntax T[A, B, C, ...]
. This will construct a 1d array with element type T
, initialized to contain elements A
, B
, C
, etc. For example Any[x, y, z]
constructs a heterogeneous array that can contain any values.
Concatenation syntax can similarly be prefixed with a type to specify the element type of the result.
julia> [[1 2] [3 4]] 1×4 Array{Int64,2}: 1 2 3 4 julia> Int8[[1 2] [3 4]] 1×4 Array{Int8,2}: 1 2 3 4
Comprehensions provide a general and powerful way to construct arrays. Comprehension syntax is similar to set construction notation in mathematics:
A = [ F(x,y,...) for x=rx, y=ry, ... ]
The meaning of this form is that F(x,y,...)
is evaluated with the variables x
, y
, etc. taking on each value in their given list of values. Values can be specified as any iterable object, but will commonly be ranges like 1:n
or 2:(n1)
, or explicit arrays of values like [1.2, 3.4, 5.7]
. The result is an Nd dense array with dimensions that are the concatenation of the dimensions of the variable ranges rx
, ry
, etc. and each F(x,y,...)
evaluation returns a scalar.
The following example computes a weighted average of the current element and its left and right neighbor along a 1d grid. :
julia> x = rand(8) 8element Array{Float64,1}: 0.843025 0.869052 0.365105 0.699456 0.977653 0.994953 0.41084 0.809411 julia> [ 0.25*x[i1] + 0.5*x[i] + 0.25*x[i+1] for i=2:length(x)1 ] 6element Array{Float64,1}: 0.736559 0.57468 0.685417 0.912429 0.8446 0.656511
The resulting array type depends on the types of the computed elements. In order to control the type explicitly, a type can be prepended to the comprehension. For example, we could have requested the result in single precision by writing:
Float32[ 0.25*x[i1] + 0.5*x[i] + 0.25*x[i+1] for i=2:length(x)1 ]
Comprehensions can also be written without the enclosing square brackets, producing an object known as a generator. This object can be iterated to produce values on demand, instead of allocating an array and storing them in advance (see Iteration). For example, the following expression sums a series without allocating memory:
julia> sum(1/n^2 for n=1:1000) 1.6439345666815615
When writing a generator expression with multiple dimensions inside an argument list, parentheses are needed to separate the generator from subsequent arguments:
julia> map(tuple, 1/(i+j) for i=1:2, j=1:2, [1:4;]) ERROR: syntax: invalid iteration specification
All commaseparated expressions after for
are interpreted as ranges. Adding parentheses lets us add a third argument to map
:
julia> map(tuple, (1/(i+j) for i=1:2, j=1:2), [1 3; 2 4]) 2×2 Array{Tuple{Float64,Int64},2}: (0.5, 1) (0.333333, 3) (0.333333, 2) (0.25, 4)
Ranges in generators and comprehensions can depend on previous ranges by writing multiple for
keywords:
julia> [(i,j) for i=1:3 for j=1:i] 6element Array{Tuple{Int64,Int64},1}: (1, 1) (2, 1) (2, 2) (3, 1) (3, 2) (3, 3)
In such cases, the result is always 1d.
Generated values can be filtered using the if
keyword:
julia> [(i,j) for i=1:3 for j=1:i if i+j == 4] 2element Array{Tuple{Int64,Int64},1}: (2, 2) (3, 1)
The general syntax for indexing into an ndimensional array A is:
X = A[I_1, I_2, ..., I_n]
where each I_k
may be a scalar integer, an array of integers, or any other supported index. This includes Colon
(:
) to select all indices within the entire dimension, ranges of the form a:c
or a:b:c
to select contiguous or strided subsections, and arrays of booleans to select elements at their true
indices.
If all the indices are scalars, then the result X
is a single element from the array A
. Otherwise, X
is an array with the same number of dimensions as the sum of the dimensionalities of all the indices.
If all indices are vectors, for example, then the shape of X
would be (length(I_1), length(I_2), ..., length(I_n))
, with location (i_1, i_2, ..., i_n)
of X
containing the value A[I_1[i_1], I_2[i_2], ..., I_n[i_n]]
. If I_1
is changed to a twodimensional matrix, then X
becomes an n+1
dimensional array of shape (size(I_1, 1), size(I_1, 2), length(I_2), ..., length(I_n))
. The matrix adds a dimension. The location (i_1, i_2, i_3, ..., i_{n+1})
contains the value at A[I_1[i_1, i_2], I_2[i_3], ..., I_n[i_{n+1}]]
. All dimensions indexed with scalars are dropped. For example, the result of A[2, I, 3]
is an array with size size(I)
. Its i
th element is populated by A[2, I[i], 3]
.
As a special part of this syntax, the end
keyword may be used to represent the last index of each dimension within the indexing brackets, as determined by the size of the innermost array being indexed. Indexing syntax without the end
keyword is equivalent to a call to getindex
:
X = getindex(A, I_1, I_2, ..., I_n)
Example:
julia> x = reshape(1:16, 4, 4) 4×4 Base.ReshapedArray{Int64,2,UnitRange{Int64},Tuple{}}: 1 5 9 13 2 6 10 14 3 7 11 15 4 8 12 16 julia> x[2:3, 2:end1] 2×2 Array{Int64,2}: 6 10 7 11 julia> x[1, [2 3; 4 1]] 2×2 Array{Int64,2}: 5 9 13 1
Empty ranges of the form n:n1
are sometimes used to indicate the interindex location between n1
and n
. For example, the searchsorted()
function uses this convention to indicate the insertion point of a value not found in a sorted array:
julia> a = [1,2,5,6,7]; julia> searchsorted(a, 3) 3:2
The general syntax for assigning values in an ndimensional array A is:
A[I_1, I_2, ..., I_n] = X
where each I_k
may be a scalar integer, an array of integers, or any other supported index. This includes Colon
(:
) to select all indices within the entire dimension, ranges of the form a:c
or a:b:c
to select contiguous or strided subsections, and arrays of booleans to select elements at their true
indices.
If X
is an array, it must have the same number of elements as the product of the lengths of the indices: prod(length(I_1), length(I_2), ..., length(I_n))
. The value in location I_1[i_1], I_2[i_2], ..., I_n[i_n]
of A
is overwritten with the value X[i_1, i_2, ..., i_n]
. If X
is not an array, its value is written to all referenced locations of A
.
Just as in Indexing, the end
keyword may be used to represent the last index of each dimension within the indexing brackets, as determined by the size of the array being assigned into. Indexed assignment syntax without the end
keyword is equivalent to a call to setindex!()
:
setindex!(A, X, I_1, I_2, ..., I_n)
Example:
julia> x = collect(reshape(1:9, 3, 3)) 3×3 Array{Int64,2}: 1 4 7 2 5 8 3 6 9 julia> x[1:2, 2:3] = 1 1 julia> x 3×3 Array{Int64,2}: 1 1 1 2 1 1 3 6 9
In the expression A[I_1, I_2, ..., I_n]
, each I_k
may be a scalar index, an array of scalar indices, or an object that represents an array of scalar indices and can be converted to such by to_indices
:
A scalar index. By default this includes:
Nonboolean integers
CartesianIndex{N}
s, which behave like an N
tuple of integers spanning multiple dimensions (see below for more details)
An array of scalar indices. This includes:
Vectors and multidimensional arrays of integers
Empty arrays like []
, which select no elements
Range
s of the form a:c
or a:b:c
, which select contiguous or strided subsections from a
to c
(inclusive)
Any custom array of scalar indices that is a subtype of AbstractArray
Arrays of CartesianIndex{N}
(see below for more details)
An object that represents an array of scalar indices and can be converted to such by to_indices
. By default this includes:
Colon()
(:
), which represents all indices within an entire dimension or across the entire array
Arrays of booleans, which select elements at their true
indices (see below for more details)
The special CartesianIndex{N}
object represents a scalar index that behaves like an N
tuple of integers spanning multiple dimensions. For example:
julia> A = reshape(1:32, 4, 4, 2); julia> A[3, 2, 1] 7 julia> A[CartesianIndex(3, 2, 1)] == A[3, 2, 1] == 7 true
Considered alone, this may seem relatively trivial; CartesianIndex
simply gathers multiple integers together into one object that represents a single multidimensional index. When combined with other indexing forms and iterators that yield CartesianIndex
es, however, this can lead directly to very elegant and efficient code. See Iteration below, and for some more advanced examples, see this blog post on multidimensional algorithms and iteration.
Arrays of CartesianIndex{N}
are also supported. They represent a collection of scalar indices that each span N
dimensions, enabling a form of indexing that is sometimes referred to as pointwise indexing. For example, it enables accessing the diagonal elements from the first "page" of A
from above:
julia> page = A[:,:,1] 4×4 Array{Int64,2}: 1 5 9 13 2 6 10 14 3 7 11 15 4 8 12 16 julia> page[[CartesianIndex(1,1), CartesianIndex(2,2), CartesianIndex(3,3), CartesianIndex(4,4)]] 4element Array{Int64,1}: 1 6 11 16
This can be expressed much more simply with dot broadcasting and by combining it with a normal integer index (instead of extracting the first page
from A
as a separate step). It can even be combined with a :
to extract both diagonals from the two pages at the same time:
julia> A[CartesianIndex.(indices(A, 1), indices(A, 2)), 1] 4element Array{Int64,1}: 1 6 11 16 julia> A[CartesianIndex.(indices(A, 1), indices(A, 2)), :] 4×2 Array{Int64,2}: 1 17 6 22 11 27 16 32
CartesianIndex
and arrays of CartesianIndex
are not compatible with the end
keyword to represent the last index of a dimension. Do not use end
in indexing expressions that may contain either CartesianIndex
or arrays thereof.
Often referred to as logical indexing or indexing with a logical mask, indexing by a boolean array selects elements at the indices where its values are true
. Indexing by a boolean vector B
is effectively the same as indexing by the vector of integers that is returned by find(B)
. Similarly, indexing by a N
dimensional boolean array is effectively the same as indexing by the vector of CartesianIndex{N}
s where its values are true
. A logical index must be a vector of the same length as the dimension it indexes into, or it must be the only index provided and match the size and dimensionality of the array it indexes into. It is generally more efficient to use boolean arrays as indices directly instead of first calling find()
.
julia> x = reshape(1:16, 4, 4) 4×4 Base.ReshapedArray{Int64,2,UnitRange{Int64},Tuple{}}: 1 5 9 13 2 6 10 14 3 7 11 15 4 8 12 16 julia> x[[false, true, true, false], :] 2×4 Array{Int64,2}: 2 6 10 14 3 7 11 15 julia> mask = map(ispow2, x) 4×4 Array{Bool,2}: true false false false true false false false false false false false true true false true julia> x[mask] 5element Array{Int64,1}: 1 2 4 8 16
The recommended ways to iterate over a whole array are
for a in A # Do something with the element a end for i in eachindex(A) # Do something with i and/or A[i] end
The first construct is used when you need the value, but not index, of each element. In the second construct, i
will be an Int
if A
is an array type with fast linear indexing; otherwise, it will be a CartesianIndex
:
julia> A = rand(4,3); julia> B = view(A, 1:3, 2:3); julia> for i in eachindex(B) @show i end i = CartesianIndex{2}((1, 1)) i = CartesianIndex{2}((2, 1)) i = CartesianIndex{2}((3, 1)) i = CartesianIndex{2}((1, 2)) i = CartesianIndex{2}((2, 2)) i = CartesianIndex{2}((3, 2))
In contrast with for i = 1:length(A)
, iterating with eachindex
provides an efficient way to iterate over any array type.
If you write a custom AbstractArray
type, you can specify that it has fast linear indexing using
Base.IndexStyle(::Type{<:MyArray}) = IndexLinear()
This setting will cause eachindex
iteration over a MyArray
to use integers. If you don't specify this trait, the default value IndexCartesian()
is used.
The following operators are supported for arrays:
Unary arithmetic – 
, +
Binary arithmetic – 
, +
, *
, /
, \
, ^
Comparison – ==
, !=
, ≈
(isapprox
), ≉
Most of the binary arithmetic operators listed above also operate elementwise when one argument is scalar: 
, +
, and *
when either argument is scalar, and /
and \
when the denominator is scalar. For example, [1, 2] + 3 == [4, 5]
and [6, 4] / 2 == [3, 2]
.
Additionally, to enable convenient vectorization of mathematical and other operations, Julia provides the dot syntax f.(args...)
, e.g. sin.(x)
or min.(x,y)
, for elementwise operations over arrays or mixtures of arrays and scalars (a Broadcasting operation); these have the additional advantage of "fusing" into a single loop when combined with other dot calls, e.g. sin.(cos.(x))
.
Also, every binary operator supports a dot version that can be applied to arrays (and combinations of arrays and scalars) in such fused broadcasting operations, e.g. z .== sin.(x .* y)
.
Note that comparisons such as ==
operate on whole arrays, giving a single boolean answer. Use dot operators like .==
for elementwise comparisons. (For comparison operations like <
, only the elementwise .<
version is applicable to arrays.)
Also notice the difference between max.(a,b)
, which broadcast
s max()
elementwise over a
and b
, and maximum(a)
, which finds the largest value within a
. The same relationship holds for min.(a,b)
and minimum(a)
.
It is sometimes useful to perform elementbyelement binary operations on arrays of different sizes, such as adding a vector to each column of a matrix. An inefficient way to do this would be to replicate the vector to the size of the matrix:
julia> a = rand(2,1); A = rand(2,3); julia> repmat(a,1,3)+A 2×3 Array{Float64,2}: 1.20813 1.82068 1.25387 1.56851 1.86401 1.67846
This is wasteful when dimensions get large, so Julia offers broadcast()
, which expands singleton dimensions in array arguments to match the corresponding dimension in the other array without using extra memory, and applies the given function elementwise:
julia> broadcast(+, a, A) 2×3 Array{Float64,2}: 1.20813 1.82068 1.25387 1.56851 1.86401 1.67846 julia> b = rand(1,2) 1×2 Array{Float64,2}: 0.867535 0.00457906 julia> broadcast(+, a, b) 2×2 Array{Float64,2}: 1.71056 0.847604 1.73659 0.873631
Dotted operators such as .+
and .*
are equivalent to broadcast
calls (except that they fuse, as described below). There is also a broadcast!()
function to specify an explicit destination (which can also be accessed in a fusing fashion by .=
assignment), and functions broadcast_getindex()
and broadcast_setindex!()
that broadcast the indices before indexing. Moreover, f.(args...)
is equivalent to broadcast(f, args...)
, providing a convenient syntax to broadcast any function (dot syntax). Nested "dot calls" f.(...)
(including calls to .+
etcetera) automatically fuse into a single broadcast
call.
Additionally, broadcast()
is not limited to arrays (see the function documentation), it also handles tuples and treats any argument that is not an array, tuple or Ref
(except for Ptr
) as a "scalar".
julia> convert.(Float32, [1, 2]) 2element Array{Float32,1}: 1.0 2.0 julia> ceil.((UInt8,), [1.2 3.4; 5.6 6.7]) 2×2 Array{UInt8,2}: 0x02 0x04 0x06 0x07 julia> string.(1:3, ". ", ["First", "Second", "Third"]) 3element Array{String,1}: "1. First" "2. Second" "3. Third"
The base array type in Julia is the abstract type AbstractArray{T,N}
. It is parametrized by the number of dimensions N
and the element type T
. AbstractVector
and AbstractMatrix
are aliases for the 1d and 2d cases. Operations on AbstractArray
objects are defined using higher level operators and functions, in a way that is independent of the underlying storage. These operations generally work correctly as a fallback for any specific array implementation.
The AbstractArray
type includes anything vaguely arraylike, and implementations of it might be quite different from conventional arrays. For example, elements might be computed on request rather than stored. However, any concrete AbstractArray{T,N}
type should generally implement at least size(A)
(returning an Int
tuple), getindex(A,i)
and getindex(A,i1,...,iN)
; mutable arrays should also implement setindex!()
. It is recommended that these operations have nearly constant time complexity, or technically Õ(1) complexity, as otherwise some array functions may be unexpectedly slow. Concrete types should also typically provide a similar(A,T=eltype(A),dims=size(A))
method, which is used to allocate a similar array for copy()
and other outofplace operations. No matter how an AbstractArray{T,N}
is represented internally, T
is the type of object returned by integer indexing (A[1, ..., 1]
, when A
is not empty) and N
should be the length of the tuple returned by size()
.
DenseArray
is an abstract subtype of AbstractArray
intended to include all arrays that are laid out at regular offsets in memory, and which can therefore be passed to external C and Fortran functions expecting this memory layout. Subtypes should provide a method stride(A,k)
that returns the "stride" of dimension k
: increasing the index of dimension k
by 1
should increase the index i
of getindex(A,i)
by stride(A,k)
. If a pointer conversion method Base.unsafe_convert(Ptr{T}, A)
is provided, the memory layout should correspond in the same way to these strides.
The Array
type is a specific instance of DenseArray
where elements are stored in columnmajor order (see additional notes in Performance Tips). Vector
and Matrix
are aliases for the 1d and 2d cases. Specific operations such as scalar indexing, assignment, and a few other basic storagespecific operations are all that have to be implemented for Array
, so that the rest of the array library can be implemented in a generic manner.
SubArray
is a specialization of AbstractArray
that performs indexing by reference rather than by copying. A SubArray
is created with the view()
function, which is called the same way as getindex()
(with an array and a series of index arguments). The result of view()
looks the same as the result of getindex()
, except the data is left in place. view()
stores the input index vectors in a SubArray
object, which can later be used to index the original array indirectly. By putting the @views
macro in front of an expression or block of code, any array[...]
slice in that expression will be converted to create a SubArray
view instead.
StridedVector
and StridedMatrix
are convenient aliases defined to make it possible for Julia to call a wider range of BLAS and LAPACK functions by passing them either Array
or SubArray
objects, and thus saving inefficiencies from memory allocation and copying.
The following example computes the QR decomposition of a small section of a larger array, without creating any temporaries, and by calling the appropriate LAPACK function with the right leading dimension size and stride parameters.
julia> a = rand(10,10) 10×10 Array{Float64,2}: 0.561255 0.226678 0.203391 0.308912 … 0.750307 0.235023 0.217964 0.718915 0.537192 0.556946 0.996234 0.666232 0.509423 0.660788 0.493501 0.0565622 0.118392 0.493498 0.262048 0.940693 0.252965 0.0470779 0.736979 0.264822 0.228787 0.161441 0.897023 0.567641 0.343935 0.32327 0.795673 0.452242 0.468819 0.628507 0.511528 0.935597 0.991511 0.571297 0.74485 … 0.84589 0.178834 0.284413 0.160706 0.672252 0.133158 0.65554 0.371826 0.770628 0.0531208 0.306617 0.836126 0.301198 0.0224702 0.39344 0.0370205 0.536062 0.890947 0.168877 0.32002 0.486136 0.096078 0.172048 0.77672 0.507762 0.573567 0.220124 0.165816 0.211049 0.433277 0.539476 julia> b = view(a, 2:2:8,2:2:4) 4×2 SubArray{Float64,2,Array{Float64,2},Tuple{StepRange{Int64,Int64},StepRange{Int64,Int64}},false}: 0.537192 0.996234 0.736979 0.228787 0.991511 0.74485 0.836126 0.0224702 julia> (q,r) = qr(b); julia> q 4×2 Array{Float64,2}: 0.338809 0.78934 0.464815 0.230274 0.625349 0.194538 0.527347 0.534856 julia> r 2×2 Array{Float64,2}: 1.58553 0.921517 0.0 0.866567
Julia has builtin support for sparse vectors and sparse matrices. Sparse arrays are arrays that contain enough zeros that storing them in a special data structure leads to savings in space and execution time, compared to dense arrays.
In Julia, sparse matrices are stored in the Compressed Sparse Column (CSC) format. Julia sparse matrices have the type SparseMatrixCSC{Tv,Ti}
, where Tv
is the type of the stored values, and Ti
is the integer type for storing column pointers and row indices. The internal representation of SparseMatrixCSC
is as follows:
struct SparseMatrixCSC{Tv,Ti<:Integer} <: AbstractSparseMatrix{Tv,Ti} m::Int # Number of rows n::Int # Number of columns colptr::Vector{Ti} # Column i is in colptr[i]:(colptr[i+1]1) rowval::Vector{Ti} # Row indices of stored values nzval::Vector{Tv} # Stored values, typically nonzeros end
The compressed sparse column storage makes it easy and quick to access the elements in the column of a sparse matrix, whereas accessing the sparse matrix by rows is considerably slower. Operations such as insertion of previously unstored entries one at a time in the CSC structure tend to be slow. This is because all elements of the sparse matrix that are beyond the point of insertion have to be moved one place over.
All operations on sparse matrices are carefully implemented to exploit the CSC data structure for performance, and to avoid expensive operations.
If you have data in CSC format from a different application or library, and wish to import it in Julia, make sure that you use 1based indexing. The row indices in every column need to be sorted. If your SparseMatrixCSC
object contains unsorted row indices, one quick way to sort them is by doing a double transpose.
In some applications, it is convenient to store explicit zero values in a SparseMatrixCSC
. These are accepted by functions in Base
(but there is no guarantee that they will be preserved in mutating operations). Such explicitly stored zeros are treated as structural nonzeros by many routines. The nnz()
function returns the number of elements explicitly stored in the sparse data structure, including structural nonzeros. In order to count the exact number of numerical nonzeros, use countnz()
, which inspects every stored element of a sparse matrix. dropzeros()
, and the inplace dropzeros!()
, can be used to remove stored zeros from the sparse matrix.
julia> A = sparse([1, 2, 3], [1, 2, 3], [0, 2, 0]) 3×3 SparseMatrixCSC{Int64,Int64} with 3 stored entries: [1, 1] = 0 [2, 2] = 2 [3, 3] = 0 julia> dropzeros(A) 3×3 SparseMatrixCSC{Int64,Int64} with 1 stored entry: [2, 2] = 2
Sparse vectors are stored in a close analog to compressed sparse column format for sparse matrices. In Julia, sparse vectors have the type SparseVector{Tv,Ti}
where Tv
is the type of the stored values and Ti
the integer type for the indices. The internal representation is as follows:
struct SparseVector{Tv,Ti<:Integer} <: AbstractSparseVector{Tv,Ti} n::Int # Length of the sparse vector nzind::Vector{Ti} # Indices of stored values nzval::Vector{Tv} # Stored values, typically nonzeros end
As for SparseMatrixCSC
, the SparseVector
type can also contain explicitly stored zeros. (See Sparse Matrix Storage.).
The simplest way to create sparse arrays is to use functions equivalent to the zeros()
and eye()
functions that Julia provides for working with dense arrays. To produce sparse arrays instead, you can use the same names with an sp
prefix:
julia> spzeros(3) 3element SparseVector{Float64,Int64} with 0 stored entries julia> speye(3,5) 3×5 SparseMatrixCSC{Float64,Int64} with 3 stored entries: [1, 1] = 1.0 [2, 2] = 1.0 [3, 3] = 1.0
The sparse()
function is often a handy way to construct sparse arrays. For example, to construct a sparse matrix we can input a vector I
of row indices, a vector J
of column indices, and a vector V
of stored values (this is also known as the COO (coordinate) format). sparse(I,J,V)
then constructs a sparse matrix such that S[I[k], J[k]] = V[k]
. The equivalent sparse vector constructor is sparsevec
, which takes the (row) index vector I
and the vector V
with the stored values and constructs a sparse vector R
such that R[I[k]] = V[k]
.
julia> I = [1, 4, 3, 5]; J = [4, 7, 18, 9]; V = [1, 2, 5, 3]; julia> S = sparse(I,J,V) 5×18 SparseMatrixCSC{Int64,Int64} with 4 stored entries: [1 , 4] = 1 [4 , 7] = 2 [5 , 9] = 3 [3 , 18] = 5 julia> R = sparsevec(I,V) 5element SparseVector{Int64,Int64} with 4 stored entries: [1] = 1 [3] = 5 [4] = 2 [5] = 3
The inverse of the sparse()
and sparsevec
functions is findnz()
, which retrieves the inputs used to create the sparse array. There is also a findn
function which only returns the index vectors.
julia> findnz(S) ([1, 4, 5, 3], [4, 7, 9, 18], [1, 2, 3, 5]) julia> findn(S) ([1, 4, 5, 3], [4, 7, 9, 18]) julia> findnz(R) ([1, 3, 4, 5], [1, 5, 2, 3]) julia> findn(R) 4element Array{Int64,1}: 1 3 4 5
Another way to create a sparse array is to convert a dense array into a sparse array using the sparse()
function:
julia> sparse(eye(5)) 5×5 SparseMatrixCSC{Float64,Int64} with 5 stored entries: [1, 1] = 1.0 [2, 2] = 1.0 [3, 3] = 1.0 [4, 4] = 1.0 [5, 5] = 1.0 julia> sparse([1.0, 0.0, 1.0]) 3element SparseVector{Float64,Int64} with 2 stored entries: [1] = 1.0 [3] = 1.0
You can go in the other direction using the Array
constructor. The issparse()
function can be used to query if a matrix is sparse.
julia> issparse(speye(5)) true
Arithmetic operations on sparse matrices also work as they do on dense matrices. Indexing of, assignment into, and concatenation of sparse matrices work in the same way as dense matrices. Indexing operations, especially assignment, are expensive, when carried out one element at a time. In many cases it may be better to convert the sparse matrix into (I,J,V)
format using findnz()
, manipulate the values or the structure in the dense vectors (I,J,V)
, and then reconstruct the sparse matrix.
The following table gives a correspondence between builtin methods on sparse matrices and their corresponding methods on dense matrix types. In general, methods that generate sparse matrices differ from their dense counterparts in that the resulting matrix follows the same sparsity pattern as a given sparse matrix S
, or that the resulting sparse matrix has density d
, i.e. each matrix element has a probability d
of being nonzero.
Details can be found in the Sparse Vectors and Matrices section of the standard library reference.
Sparse  Dense  Description 

spzeros(m,n) 
zeros(m,n) 
Creates a mbyn matrix of zeros. (spzeros(m,n) is empty.) 
spones(S) 
ones(m,n) 
Creates a matrix filled with ones. Unlike the dense version, spones() has the same sparsity pattern as S. 
speye(n) 
eye(n) 
Creates a nbyn identity matrix. 
full(S) 
sparse(A) 
Interconverts between dense and sparse formats. 
sprand(m,n,d) 
rand(m,n) 
Creates a mbyn random matrix (of density d) with iid nonzero elements distributed uniformly on the halfopen interval $[0, 1)$. 
sprandn(m,n,d) 
randn(m,n) 
Creates a mbyn random matrix (of density d) with iid nonzero elements distributed according to the standard normal (Gaussian) distribution. 
sprandn(m,n,d,X) 
randn(m,n,X) 
Creates a mbyn random matrix (of density d) with iid nonzero elements distributed according to the X distribution. (Requires the Distributions package.) 
© 2009–2016 Jeff Bezanson, Stefan Karpinski, Viral B. Shah, and other contributors
Licensed under the MIT License.
https://docs.julialang.org/en/release0.6/manual/arrays/