[go: nahoru, domu]

Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Assigning SVector and Tuples are allocating #1015

Open
charleskawczynski opened this issue Nov 13, 2022 · 6 comments
Open

Assigning SVector and Tuples are allocating #1015

charleskawczynski opened this issue Nov 13, 2022 · 6 comments
Labels
bug Something isn't working performance

Comments

@charleskawczynski
Copy link
Member

Here's a MWE:

using Test

import StaticArrays
import ClimaCore
using ClimaCore: Geometry, Domains, Meshes, Topologies, Spaces, Fields

function toy_sphere(::Type{FT}) where {FT}
    radius = FT(1e7)
    zmax = FT(1e4)
    helem = npoly = 2
    velem = 4

    hdomain = Domains.SphereDomain(radius)
    hmesh = Meshes.EquiangularCubedSphere(hdomain, helem)
    htopology = Topologies.Topology2D(hmesh)
    quad = Spaces.Quadratures.GLL{npoly + 1}()
    hspace = Spaces.SpectralElementSpace2D(htopology, quad)

    vdomain = Domains.IntervalDomain(
        Geometry.ZPoint{FT}(zero(FT)),
        Geometry.ZPoint{FT}(zmax);
        boundary_tags = (:bottom, :top),
    )
    vmesh = Meshes.IntervalMesh(vdomain, nelems = velem)
    # center_space = Spaces.CenterFiniteDifferenceSpace(vmesh)

    vspace = Spaces.CenterFiniteDifferenceSpace(vmesh)

    # TODO: Replace this with a space that includes topography.
    center_space = Spaces.ExtrudedFiniteDifferenceSpace(hspace, vspace)
    # center_coords = Fields.coordinate_field(center_space)
    face_space = Spaces.FaceExtrudedFiniteDifferenceSpace(center_space)

    # face_space = Spaces.FaceFiniteDifferenceSpace(center_space)
    return (;center_space,face_space)
end

function field_vec(center_space, face_space)
    Y = Fields.FieldVector(
        c = map(Fields.coordinate_field(center_space)) do coord
                FT = Spaces.undertype(center_space)
                (;
                    uₕ = Geometry.Covariant12Vector(FT(0), FT(0)),
                    uₕ_phys = StaticArrays.SVector(FT(0), FT(0)),
                    uₕ_phys_tup = (FT(0), FT(0)),
                )
            end
    )
    return Y
end

run_svec!(Y) = run_svec!(Y.c.uₕ_phys, Y.c.uₕ)
function run_svec!(uₕ_phys, uₕ)
    @. uₕ_phys = StaticArrays.SVector(
        Geometry.UVVector(uₕ).components.data.:1,
        Geometry.UVVector(uₕ).components.data.:2,
    )
    return nothing
end

run_tup!(Y) = run_tup!(Y.c.uₕ_phys_tup, Y.c.uₕ)
function run_tup!(uₕ_phys_tup, uₕ)
    @. uₕ_phys_tup = tuple(
        Geometry.UVVector(uₕ).components.data.:1,
        Geometry.UVVector(uₕ).components.data.:2,
    )
    return nothing
end

function test_alloc(Y)
    run_svec!(Y) # compile first
    p = @allocated run_svec!(Y)
    @test p == 0

    run_tup!(Y) # compile first
    p = @allocated run_tup!(Y)
    @test p == 0
end

(;center_space,face_space) = toy_sphere(Float32)
Y = field_vec(center_space,face_space)

@testset "Allocations" begin
    test_alloc(Y)
end
nothing
@charleskawczynski charleskawczynski added bug Something isn't working performance labels Nov 13, 2022
bors bot added a commit that referenced this issue Jul 6, 2023
1016: Add broken alloc tests for assigning SArrays and tuples r=charleskawczynski a=charleskawczynski

This PR adds a broken allocation test for when we assign SArrays and Tuples. Related issue: #1015.

Co-authored-by: Charles Kawczynski <kawczynski.charles@gmail.com>
@kmdeck
Copy link
Contributor
kmdeck commented Sep 22, 2023

@charleskawczynski I tried something similar and it seems like Tuples and NamedTuples are OK now:

using ClimaCore
using ClimaComms
import ClimaCore: Fields, Spaces, Meshes, Topologies, Domains
using BenchmarkTools

FT = Float64
# set up the space
context = ClimaComms.SingletonCommsContext()
radius = FT(3)
ne = 100
Nq = 4
mesh = Meshes.EquiangularCubedSphere(Domains.SphereDomain(radius), ne)
topology = Topologies.Topology2D(context, mesh)
quad = Spaces.Quadratures.GLL{Nq}()
space = Spaces.SpectralElementSpace2D(topology, quad)

# the types to try. In the last case, we would have two fields, y and z, 
# of floats, instead of a single field with two values at each point
types = (Tuple{FT,FT}, NamedTuple{(:y,:z), Tuple{FT,FT}}, FT)

function return_tuple(val)
    return (val, val)
end

function return_named_tuple(val)
    return (;y = val, z = val)
end

# Methods for broadcasting for the different types
function set_field!(output, val, type::Type{NamedTuple{T, Tuple{FT,FT}}}) where {T, FT}
    @.  output = return_named_tuple(val)
end

function set_field!(output, val, type::Type{Tuple{FT,FT}}) where {FT}
    @.  output = return_tuple(val)
end

function set_field!(y,z, val, type::Type{FT}) where {FT <: AbstractFloat}
   # not sure how to avoid this allocating step
    foo = return_tuple.(val)
    y .= foo.:1
    z .= foo.:2
end

x = Fields.Field(FT, space)
x .= FT(1.0)

@show ("Tuple")
v1 = Fields.Field(types[1], space)
@btime set_field!(v1, x, types[1])

@show ("NamedTuple")
v2 = Fields.Field(types[2], space)
@btime set_field!(v2, x, types[2])

@show ("Floats")
y = similar(x)
z = similar(x)
@btime set_field!(y,z, x, types[3])

with output
"Tuple" = "Tuple"
1.587 ms (0 allocations: 0 bytes)
"NamedTuple" = "NamedTuple"
1.587 ms (0 allocations: 0 bytes)
"Floats" = "Floats"
5.676 ms (3 allocations: 14.65 MiB)

@charleskawczynski
Copy link
Member Author

Can you try

    y .= return_tuple.(val).:1
    z .= return_tuple.(val).:2

?

@kmdeck
Copy link
Contributor
kmdeck commented Sep 22, 2023

Yes! I changed to this:

function set_field!(y,z, val, type::Type{FT}) where {FT <: AbstractFloat}
    #foo = return_tuple.(val)
    y .= return_tuple.(val).:1
    z .= return_tuple.(val).:2
end

and it seems ~2x as worse?

"Floats" = "Floats"
8.177 ms (6 allocations: 29.30 MiB)

Maybe because it now allocates a Tuple valued field twice, and just extracts one element from it each time for the broadcast assignment statements?

@charleskawczynski
Copy link
Member Author

The last one in theory shouldn't be allocating, so I guess it's still an issue. I'm pretty sure that using a struct in this situation will not result in allocations here.

@kmdeck
Copy link
Contributor
kmdeck commented Sep 25, 2023

The last one in theory shouldn't be allocating, so I guess it's still an issue. I'm pretty sure that using a struct in this situation will not result in allocations here.

I am missing something, Im sure, but return_tuple makes and returns a tuple valued field. But y and z are both fields of floats. so doesnt memory need to be allocated to hold the result of return_tuple? and then we just index into that to set y and z?

@charleskawczynski
Copy link
Member Author

That's true, but it should be stack allocated because only one component is ever needed in the final expression.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working performance
Projects
None yet
Development

No branches or pull requests

2 participants