jerryli27 / AniSeg
A faster-rcnn model for anime character segmentation.
License: Apache License 2.0
This may sound weird, but is it possible to use the edges of regions to refine the figure segmentation and make it more accurate?
Thanks for your work!
May I ask how you generated the dataset for figure segmentation?
I've run AniSeg on Danbooru2019 solo SFW images to generate a large corpus of close-up single-character whole-body/profile images to help our BigGAN learn bodies. The dataset, and my code for how to generate it, is likely of interest to AniSeg users. Writeup: https://www.gwern.net/Crops#danbooru2019-figures
I'm trying some other models, and I would like to try them on your datasets to achieve better results.
To train the model, we overlaid segmented anime figures on top of pure background images to create an artificial dataset. We found this gives decent performance. You can find pure background images in the Danbooru 2018 dataset. Please contact us if you'd like to use our pre-generated tfrecords.
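For anyone curious how such compositing might look in practice, here is a minimal sketch of the idea described in the reply above. It is not the authors' actual pipeline; the file names, output size, and random placement are illustrative assumptions.

# Sketch only: paste an RGBA figure cutout onto a background and keep the alpha
# channel as the ground-truth segmentation mask. Paths and sizes are made up.
import random
from PIL import Image

def composite(figure_path, background_path, out_size=(512, 512)):
  bg = Image.open(background_path).convert("RGB").resize(out_size)
  fig = Image.open(figure_path).convert("RGBA")  # alpha channel = figure mask
  # Random placement so one figure/background pair yields many training images.
  x = random.randint(0, max(0, out_size[0] - fig.width))
  y = random.randint(0, max(0, out_size[1] - fig.height))
  bg.paste(fig, (x, y), mask=fig)          # alpha-composite the figure
  mask = Image.new("L", out_size, 0)
  mask.paste(fig.split()[-1], (x, y))      # the ground-truth mask comes for free
  return bg, mask

# image, mask = composite("figure.png", "background.jpg")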
Could you release the pre-generated tfrecords? Or maybe the original background photos and scripts for composing this dataset?
In README.md, the download link points to https://github.com/jerryli27/AniSeg/blob/master.
However, I think this path is wrong.
Could you share the correct dataset link?
Hi,
I'm trying to run this project in colab and getting the error listed below.
Here is an example colab to reproduce the issue:
https://colab.research.google.com/drive/1eo7Lz3eL8ML6BchlB84MyEOReh0YsuKI?usp=sharing
Is this a bug or am I doing something wrong?
Thanks in advance.
/content/AniSeg/object_detection/inference/mask_inference.py:123: RuntimeWarning: Unexpected end-group tag: Not all data was converted
graph_def.MergeFromString(graph_content)
Traceback (most recent call last):
File "infer_from_image.py", line 139, in <module>
tf.app.run()
File "/tensorflow-1.15.2/python3.7/tensorflow_core/python/platform/app.py", line 40, in run
_run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
File "/usr/local/lib/python3.7/dist-packages/absl/app.py", line 300, in run
_run_main(main, args)
File "/usr/local/lib/python3.7/dist-packages/absl/app.py", line 251, in _run_main
sys.exit(main(argv))
File "infer_from_image.py", line 85, in main
image_tensor, FLAGS.inference_graph, override_num_detections=FLAGS.override_num_detections)
File "/content/AniSeg/object_detection/inference/mask_inference.py", line 126, in build_inference_graph
graph_def, name='', input_map={'image_tensor': image_tensor})
File "/tensorflow-1.15.2/python3.7/tensorflow_core/python/util/deprecation.py", line 507, in new_func
return func(*args, **kwargs)
File "/tensorflow-1.15.2/python3.7/tensorflow_core/python/framework/importer.py", line 405, in import_graph_def
producer_op_list=producer_op_list)
File "/tensorflow-1.15.2/python3.7/tensorflow_core/python/framework/importer.py", line 535, in _import_graph_def_internal
', '.join(missing_unused_input_keys))
ValueError: Attempted to map inputs that were not found in graph_def: [image_tensor:0]
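One possible cause (an assumption on my part, not something the traceback proves) is that the frozen inference graph file is truncated or is not the real protobuf, e.g. an HTML download page saved under the .pb name; the "Unexpected end-group tag: Not all data was converted" warning is consistent with that. A quick way to check the file before running inference, using the same TF 1.15 environment as the colab (the path below is a placeholder):

# Hypothetical sanity check for the downloaded frozen graph.
import tensorflow as tf

GRAPH_PATH = "frozen_inference_graph.pb"  # adjust to wherever the model was saved

with tf.gfile.GFile(GRAPH_PATH, "rb") as f:
  data = f.read()

graph_def = tf.GraphDef()
graph_def.MergeFromString(data)  # warns or fails if the file is truncated or not a GraphDef
print([n.name for n in graph_def.node][:5])  # should list ops such as 'image_tensor'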
Hello.
How many images were in the dataset you used to train?
I want to know whether it's worth it for me to try to make a bigger one, or if it's just too big a task.
Thanks!
Running the code crashes because imsave is no longer provided by SciPy. It can be imported from imageio instead:
diff --git a/util_io.py b/util_io.py
index 170a21c..96a63bc 100644
--- a/util_io.py
+++ b/util_io.py
@@ -23,6 +23,7 @@ import scipy.misc
import tensorflow as tf
from PIL import Image
from typing import Union
+import imageio
###########
@@ -134,7 +135,7 @@ def imsave(path, img):
img = np.clip(img, 0, 255).astype(np.uint8)
if len(img.shape) == 3 and img.shape[-1] == 1:
img = np.squeeze(img, -1)
- scipy.misc.imsave(path, img)
+ imageio.imwrite(path, img)
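Note that with this patch applied, the imageio package also needs to be installed in the environment (e.g. pip install imageio); it is not pulled in by the repo's other dependencies.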
If you're interested in doing further training, deeppomf has an image dataset scraped from yiff.party (includes NSFW images) where PSDs are used to generate pixel-perfect segmentations of figures: https://drive.google.com/file/d/1vK7R_FiD3_pC892sWTy_fOA_HF6E2SKx/view
Artists provide the PSDs in layers, where one layer is the figure and the rest are the background, so pixel-perfect-by-construction segmentation masks can be created automatically for training (see the sketch below).
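For anyone who wants to build on that dataset, here is a minimal sketch of turning a PSD layer's alpha channel into a binary mask. It assumes psd-tools is installed and that the figure sits on a single layer; which layer that is, and the file name, are assumptions.

# Sketch only: extract a full-canvas binary mask from one layer of a PSD.
import numpy as np
from psd_tools import PSDImage

def figure_mask_from_psd(psd_path, figure_layer_index=-1):
  psd = PSDImage.open(psd_path)
  layer = list(psd)[figure_layer_index]      # assume the top layer holds the figure
  layer_img = layer.topil().convert("RGBA")  # render just that layer
  left, top = layer.offset                   # layer position on the full canvas
  alpha = np.array(layer_img)[..., 3]
  mask = np.zeros((psd.height, psd.width), dtype=np.uint8)
  # Assumes the layer lies fully inside the canvas.
  mask[top:top + layer_img.height, left:left + layer_img.width] = ((alpha > 0) * 255).astype(np.uint8)
  return mask

# mask = figure_mask_from_psd("example.psd")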
I can't find instructions on how to install the necessary runtime/libraries etc?
Why is there no requirements.txt?
This is an excellent project. I've managed to use it, but it took me hours to get the appropriate libraries installed and working.
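For anyone else setting this up: based only on the imports and tracebacks visible in this thread, a guessed dependency list (not an official requirements.txt from the authors) looks roughly like:

tensorflow==1.15.2   (or tensorflow-gpu==1.15.2)
numpy
scipy
Pillow
imageio   (only if the util_io.py patch above is applied)

Exact pins may differ; the colab above ran on Python 3.7.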
As I understand it, TensorFlow can't load the checkpoint.
Code and full trace:
https://colab.research.google.com/drive/1zQSmHVrB0VN6d7I16PZk6ED9oRIHMQc1
It would be good if the provided script could crop images down to detected figures/faces instead of just providing JSON & visualizations. For the purpose of adding data augmentation to our Danbooru2019 BigGAN to help it learn solo figures (we already have a cropped portrait dataset, which improved learning of faces noticeably), I added a pass to infer_from_image.py
which looks like this:
if len(result["detection_score"]) == 1 and result["detection_score"][0] > FLAGS.min_score_thresh:
  # result = {'detection_score': [0.9958623647689819], 'detection_bbox_ymin': [0.11348748952150345], 'detection_bbox_xmin': [0.6218132972717285], 'detection_bbox_ymax': [0.3206212520599365], 'detection_bbox_xmax': [0.8703262805938721], 'detection_class_label': [1], 'annotated_image': array([[[255, 255, 255], ...
  base, ext = os.path.splitext(os.path.basename(image_path))
  output_crop = os.path.join(FLAGS.output_path, base + '_crop.png')
  idims = image_np.shape  # np array with shape (height, width, num_color(1, 3, or 4))
  # Bounding-box coordinates are relative, so scale them to pixel indices and clamp to the image.
  min_x = min(round(result["detection_bbox_xmin"][0] * idims[1]), idims[1])
  max_x = min(round(result["detection_bbox_xmax"][0] * idims[1]), idims[1])
  min_y = min(round(result["detection_bbox_ymin"][0] * idims[0]), idims[0])
  max_y = min(round(result["detection_bbox_ymax"][0] * idims[0]), idims[0])
  image_cropped = image_np[min_y:max_y, min_x:max_x, :]
  util_io.imsave(output_crop, image_cropped)  # assumed save step; the original snippet ended at the crop
A cleaned-up and configurable version which crops out each bounding box would be a good addition.
On a side note, it'd be nice if this would use both my GPUs. I'm also not sure this is properly minibatching: it seems a lot slower than I'd expect, and the GPU utilization in nvidia-smi
is a lot bouncier and usually <100%.