
Comments (4)

Single430 commented on June 16, 2024

Hello, I also encountered the same problem. How did you solve it? @dagongji10

Just delete this line:

cps = np.array(cps)[...]
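
For context, here is a small sketch of what that index list does to the 16 control-point values from the script in this thread. It assumes the values are 8 flattened (x, y) pairs, 4 per curve, which the thread does not state explicitly; `as_points` is a helper made up for this illustration. Under that assumption, the line swaps each pair to (y, x) and reverses the order of the last four points, so deleting it passes the points on in their original order.

```python
# Control points copied from the script in this thread.
# Assumption: 8 (x, y) pairs, flattened into 16 values.
cps = [152.0, 209.0, 134.1, 34.18, 365.69, 66.2, 377.0, 206.0,
       345.0, 214.0, 334.31, 109.71, 190.03, 80.12, 203.0, 214.0]

# Index list from the line that the reply suggests deleting.
perm = [1, 0, 3, 2, 5, 4, 7, 6, 15, 14, 13, 12, 11, 10, 9, 8]
permuted = [cps[i] for i in perm]


def as_points(flat):
    """Hypothetical helper: group a flat list into consecutive pairs."""
    return [tuple(flat[i:i + 2]) for i in range(0, len(flat), 2)]


original_pts = as_points(cps)
permuted_pts = as_points(permuted)

# First four points: same order, coordinates of each pair swapped.
assert permuted_pts[:4] == [(y, x) for x, y in original_pts[:4]]
# Last four points: order reversed, coordinates of each pair swapped.
assert permuted_pts[4:] == [(y, x) for x, y in reversed(original_pts[4:])]

for before, after in zip(original_pts, permuted_pts):
    print(before, "->", after)
```

Whether a given BezierAlign implementation expects the swapped or the original ordering is not confirmed here; the sketch only shows the effect of the indexing itself.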

from bezier_curve_text_spotting.

dagongji10 commented on June 16, 2024

I have modified single_demo_bezier.py like this:

#from maskrcnn_benchmark.modeling.utils import cat
#from maskrcnn_benchmark.layers import BezierAlign
from detectron2.layers import cat
from adet.layers import BezierAlign

and got a result, but it is wrong:
[image]

My code is:

from PIL import Image, ImageOps
import numpy as np
import json

import torch
from torch import nn

#from maskrcnn_benchmark.modeling.utils import cat
#from maskrcnn_benchmark.layers import BezierAlign
from detectron2.layers import cat
from adet.layers import BezierAlign


class Model(nn.Module):
    def __init__(self, input_size, output_size, scale):
        super(Model, self).__init__()
        self.bezier_align = BezierAlign(output_size, scale, 1)
        self.masks = nn.Parameter(torch.ones(input_size, dtype=torch.float32))

    def forward(self, input, rois):
        # apply mask
        x = input * self.masks
        rois = self.convert_to_roi_format(rois)
        return self.bezier_align(x, rois)

    def convert_to_roi_format(self, beziers):
        concat_boxes = cat([b for b in beziers], dim=0)
        device, dtype = concat_boxes.device, concat_boxes.dtype
        ids = cat(
            [
                torch.full((len(b), 1), i, dtype=dtype, device=device)
                for i, b in enumerate(beziers)
            ],
            dim=0,
        )
        rois = torch.cat([ids, concat_boxes], dim=1)
        return rois


def get_size(image_size, w, h):
    w_ratio = w / image_size[1]
    h_ratio = h / image_size[0]
    down_scale = max(w_ratio, h_ratio)
    if down_scale > 1:
        return down_scale
    else:
        return 1


def test(scale=1):
    image_size = (2560, 2560)  # H x W
    output_size = (256, 1024)

    input_size = (image_size[0] // scale,
                  image_size[1] // scale)
    m = Model(input_size, output_size, 1 / scale).cuda()

    beziers = [[]]
    im_arrs = []
    down_scales = []
    
    imgfile = '1019.jpg'
    im = Image.open('imgs/' + imgfile)
    # im.show()
    # pad
    w, h = im.size
    down_scale = get_size(image_size, w, h)
    down_scales.append(down_scale)
    if down_scale > 1:
        # Image.ANTIALIAS was removed in Pillow 10; Image.LANCZOS is the same filter.
        im = im.resize((int(w / down_scale), int(h / down_scale)), Image.LANCZOS)
        w, h = im.size
    padding = (0, 0, image_size[1] - w, image_size[0] - h)
    im = ImageOps.expand(im, padding)
    im = im.resize((input_size[1], input_size[0]), Image.LANCZOS)
    im_arrs.append(np.array(im))

    cps = [152.0, 209.0, 134.1, 34.18, 365.69, 66.2, 377.0, 206.0, 345.0, 214.0, 334.31, 109.71, 190.03, 80.12, 203.0, 214.0] # 1019

    cps = np.array(cps)[[1, 0, 3, 2, 5, 4, 7, 6, 15, 14, 13, 12, 11, 10, 9, 8]]
    beziers[0].append(cps)

    beziers = [torch.from_numpy(np.stack(b)).cuda().float() for b in beziers]
    beziers = [b / d for b, d in zip(beziers, down_scales)]

    im_arrs = np.stack(im_arrs)
    x = torch.from_numpy(im_arrs).permute(0, 3, 1, 2).cuda().float()

    x = m(x, beziers)
    for i, roi in enumerate(x):
        roi = roi.cpu().detach().numpy().transpose(1, 2, 0).astype(np.uint8)
        im = Image.fromarray(roi, "RGB")
        im.save('roi_1103.png')
    loss = x.mean()
    loss.backward()
    print(m)


test(1)
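
The downscaling step in the script above can be checked on its own. The following is a minimal sketch that restates `get_size` with the same logic and runs it on two made-up image sizes (1280x720 and 5120x2560 are assumptions for illustration, not the size of 1019.jpg). It also shows how a control point is divided by the same factor, as the script does with `b / d`.

```python
def get_size(image_size, w, h):
    """Same logic as get_size in the script above; image_size is (H, W)."""
    down_scale = max(w / image_size[1], h / image_size[0])
    return down_scale if down_scale > 1 else 1


image_size = (2560, 2560)  # H x W, as in the script

# Hypothetical image that already fits: no downscaling.
small = get_size(image_size, 1280, 720)
assert small == 1

# Hypothetical image twice as wide as the canvas: downscale by 2.
large = get_size(image_size, 5120, 2560)
assert large == 2.0

# Control points must be divided by the same factor as the image.
x, y = 152.0, 209.0
scaled = (x / large, y / large)
assert scaled == (76.0, 104.5)
```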

@Yuliang-Liu Can you help me?


Single430 commented on June 16, 2024

Hello, I also encountered the same problem. How did you solve it? @dagongji10


dagongji10 commented on June 16, 2024

Hello, I also encountered the same problem. How did you solve it? @dagongji10

Just delete this line:

cps = np.array(cps)[...]

Thanks. I will have a try.
